Dynamic Identification of Threat Level Associated With a Person Using an Audio/Video Recording and Communication Device

ABSTRACT

Dynamic identification of threat levels associated with persons using audio/video (A/V) recording and communication devices, in accordance with various embodiments of the present disclosure, is provided. In one embodiment, a method for notifying a user of a threat level associated with a person within the field of view of a camera of an A/V recording and communication device is provided, the method comprising receiving, from the camera, identification data for the person; transmitting the received identification data to at least one backend server; receiving, from the backend server, information about a threat level associated with the person; and notifying the user of the threat level.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to provisional application Ser. No. 62/473,560, filed on Mar. 20, 2017. The entire contents of the priority application are hereby incorporated by reference as if fully set forth.

TECHNICAL FIELD

The present embodiments relate to automatic identification of a threat level associated with a person and/or object near a structure using audio/video (A/V) recording and communication devices (e.g., A/V recording and communication doorbell systems, A/V recording and communication security systems, etc.).

BACKGROUND

Home security is a concern for many homeowners and renters. Those seeking to protect or monitor their homes often wish to have video and audio communications with visitors, for example, those visiting an external door or entryway. Audio/Video (A/V) recording and communication devices, such as doorbells and security cameras, provide this functionality, and can also aid in crime detection and prevention. For example, audio and/or video captured by an A/V recording and communication device can be uploaded to the cloud and recorded on a remote server. Subsequent review of the A/V footage can aid law enforcement in capturing perpetrators of home burglaries and other crimes. Further, the presence of one or more A/V recording and communication devices on the exterior of a home, such as a doorbell unit at the entrance to the home, acts as a powerful deterrent against would-be burglars.

SUMMARY

The various embodiments of the present dynamic identification of threat level associated with a person using an audio/video (A/V) recording and communication device have several features, no single one of which is solely responsible for their desirable attributes. Without limiting the scope of the present embodiments as expressed by the claims that follow, their more prominent features now will be discussed briefly. After considering this discussion, and particularly after reading the section entitled “Detailed Description,” one will understand how the features of the present embodiments provide the advantages described herein.

One aspect of the present embodiments includes the realization that information about a person approaching a property (or lingering within a vicinity of the property), such as the identity of the person, may be used to determine the level of a threat the person might be posing. It would be advantageous, therefore, if the functionality of an A/V recording and communication device associated with a property (e.g., installed at a house) could be leveraged to identify a person at, or near, the property. It would also be advantageous if the functionality of the A/V recording and communication device could be leveraged to dynamically determine a threat level associated with the person, and provide a notification (e.g., visual and/or audible notification) to the owner of the property (and/or to residents or occupants of the property and/or to any person present at the property) about the threat level. The present embodiments provide these advantages and enhancements, as described below.

In a first aspect, a method for an audio/video (A/V) recording and communication device comprising a camera having a field of view is provided, the method for notifying a user of a threat level associated with a person within the field of view of the camera, the method comprising receiving, from the camera, identification data for the person; transmitting the received identification data to at least one backend server; receiving, from the backend server, information about a threat level associated with the person; and notifying the user of the threat level.

In an embodiment of the first aspect, notifying the user of the threat level comprises providing one of a plurality of different threat level notification types based on different threat levels.

In another embodiment of the first aspect, notifying the user of the threat level comprises providing a visual notification to the user.

In another embodiment of the first aspect, the A/V recording and communication device is associated with a structure having at least one colored light, wherein providing the visual notification comprises selecting a particular color from a set of colors for the colored light.

In another embodiment of the first aspect, each color in the set of colors corresponds to a different threat level.

In another embodiment of the first aspect, notifying the user of the threat level comprises providing an audible notification.

In another embodiment of the first aspect, notifying the user of the threat level comprises providing a notification on a user's device.

In another embodiment of the first aspect, the user's device is a smartphone.

In another embodiment of the first aspect, the A/V recording and communication device is a doorbell.

In another embodiment of the first aspect, the A/V recording and communication device is a security camera.

In another embodiment of the first aspect, receiving the identification data for the person comprises detecting a movement of the person; and capturing video images, by the camera, of the person.

In another embodiment of the first aspect, the identification data for the person comprises one or more video images of the person.

Another embodiment of the first aspect further comprises notifying other residents of a same neighborhood, in which the A/V recording and communication device is located, of the threat level.

In another embodiment of the first aspect, notifying the other residents comprises notifying the other residents through streetlights located in the neighborhood.

In a second aspect, an audio/video (A/V) recording and communication device is provided, the A/V recording and communication device comprising a camera configured to record video image data of an area about the A/V recording and communication device; one or more processors; a communication module configured to transmit streaming video to a client device; and a non-transitory machine readable medium storing a program which when executed by at least one of the processors notifies a user of a threat level associated with a person within a field of view of the camera, the program comprising sets of instructions for receiving, from the camera, identification data for the person; transmitting the received identification data to at least one backend server; receiving, from the backend server, information about a threat level associated with the person; and notifying the user of the threat level.

In an embodiment of the second aspect, the set of instructions for notifying the user of the threat level comprises a set of instructions for providing one of a plurality of different threat level notification types based on different threat levels.

In another embodiment of the second aspect, the set of instructions for notifying the user of the threat level comprises a set of instructions for providing a visual notification.

In another embodiment of the second aspect, the A/V recording and communication device is associated with a structure having at least one colored light, wherein the set of instructions for providing the visual notification comprises a set of instructions for selecting a particular color from a set of different colors for the colored light.

In another embodiment of the second aspect, each color in the set of colors corresponds to a different threat level.

In another embodiment of the second aspect, the set of instructions for notifying the user of the threat level comprises a set of instructions for providing an audible notification to the user.

In another embodiment of the second aspect, the set of instructions for notifying the user of the threat level comprises a set of instructions for providing a notification on a client device.

In another embodiment of the second aspect, the client device is a smartphone.

In another embodiment of the second aspect, the A/V recording and communication device is a doorbell.

In another embodiment of the second aspect, the A/V recording and communication device is a security camera.

In another embodiment of the second aspect, the set of instructions for receiving the identification data for the person comprises sets of instructions for detecting a movement of the person within the field of view of the camera; and capturing video images, by the camera, of the person and a set of other objects that are within the field of view of the camera.

In another embodiment of the second aspect, the identification data for the person comprises one or more video images of the person.

Another embodiment of the second aspect further comprises a set of instructions for notifying other persons living in a same neighborhood, in which the A/V recording and communication device is located, of the threat level.

In another embodiment of the second aspect, the set of instructions for notifying the other persons comprises a set of instructions for notifying the other persons through streetlights located in the neighborhood.

In a third aspect, a method for identifying a threat level associated with a person within a vicinity of a structure is provided, the method comprising receiving identification data associated with the person from an audio/video (A/V) recording and communication device installed at the structure; determining whether the person is identifiable by performing a computer vision process on the received identification data; upon identification of the person, determining the threat level associated with the person; and sending the determined threat level to the A/V recording and communication device.

In an embodiment of the third aspect, the A/V recording and communication device comprises a camera, wherein the identification data comprises one or more video images of the person captured by the camera.

In another embodiment of the third aspect, the camera captures the video images of the person when the A/V recording and communication device detects a movement of the person.

In another embodiment of the third aspect, the movement of the person is detected by one or more motion sensors of the A/V recording and communication device.

In another embodiment of the third aspect, the identification data comprises at least a video image of the person captured by a camera of the A/V recording and communication device.

In another embodiment of the third aspect, performing the computer vision process comprises performing a face recognition process on the person's face included in the video image.

In another embodiment of the third aspect, determining the threat level associated with the person comprises comparing the person's face with a plurality of faces stored in a database to identify the threat level associated with the person.

Another embodiment of the third aspect further comprises, when the person is not identified, assigning a particular value to the threat level associated with the person to indicate that the person was not identifiable.

In another embodiment of the third aspect, upon receiving the determined threat level, the A/V recording and communication device notifies a user of the threat level.

In another embodiment of the third aspect, notifying the user of the threat level comprises providing a visual notification to the user.

In another embodiment of the third aspect, providing the visual notification comprises selecting a particular color from a set of different colors for a colored light within the structure, wherein each color in the set of colors corresponds to a different threat level.

In another embodiment of the third aspect, the A/V recording and communication device installed at the structure is one of a doorbell and a security camera.

In another embodiment of the third aspect, the method is performed by a smart-home hub device.

In another embodiment of the third aspect, the method is performed by a backend server.

In a fourth aspect, a non-transitory machine readable medium of a server is provided, the non-transitory machine readable medium storing a program which when executed by at least one processing unit of the server identifies a threat level associated with a person within a vicinity of a structure, the program comprising sets of instructions for receiving identification data associated with the person from an audio/video (A/V) recording and communication device installed at the structure; determining whether the person is identifiable by performing a computer vision process on the received identification data; upon identification of the person, determining the threat level associated with the person; and sending the determined threat level to the A/V recording and communication device.

In an embodiment of the fourth aspect, the A/V recording and communication device comprises a camera, wherein the identification data comprises one or more video images of the person captured by the camera.

In another embodiment of the fourth aspect, the camera captures the video images of the person when the A/V recording and communication device detects a movement of the person.

In another embodiment of the fourth aspect, the movement of the person is detected by one or more motion sensors of the A/V recording and communication device.

In another embodiment of the fourth aspect, the identification data comprises at least a video image of the person captured by a camera of the A/V recording and communication device.

In another embodiment of the fourth aspect, the set of instructions for performing the computer vision process comprises a set of instructions for performing a face recognition process on the person's face included in the video image.

In another embodiment of the fourth aspect, the set of instructions for determining the threat level associated with the person comprises a set of instructions for comparing the person's face with a plurality of faces stored in a database to identify the threat level associated with the person.

In another embodiment of the fourth aspect, the program further comprises a set of instructions for, when the person is not identified, assigning a particular value to the threat level associated with the person to indicate that the person was not identifiable.

In another embodiment of the fourth aspect, upon receiving the determined threat level, the A/V recording and communication device notifies a user of the threat level.

In another embodiment of the fourth aspect, notifying the user of the threat level comprises providing a visual notification to the user.

In another embodiment of the fourth aspect, providing the visual notification comprises selecting a particular color from a set of different colors for a colored light within the structure, wherein each color in the set of colors corresponds to a different threat level.

In another embodiment of the fourth aspect, the A/V recording and communication device installed at the structure is one of a doorbell and a security camera.

BRIEF DESCRIPTION OF THE DRAWINGS

The various embodiments of the present dynamic identification of the threat level associated with a person using audio/video (A/V) recording and communication devices now will be discussed in detail with an emphasis on highlighting the advantageous features. These embodiments depict the novel and non-obvious A/V recording and communication devices shown in the accompanying drawings, and the methods which can be performed with them, which are for illustrative purposes only. These drawings include the following figures, in which like numerals indicate like parts:

FIG. 1 is a functional block diagram illustrating a system for streaming and storing A/V content captured by an A/V recording and communication device, and for providing to a user an indication of a threat level, according to various aspects of the present disclosure;

FIG. 2 is a schematic diagram of a structure, illustrating an example of notifying a person within the structure about a person at the entrance of the structure and the threat (if any) he or she might pose, according to some aspects of the present embodiments;

FIG. 3 is a schematic diagram of a structure, illustrating an example of notifying a person within the structure about an object placed near the entrance of the structure, according to some aspects of the present embodiments;

FIG. 4 is a schematic diagram of a structure, illustrating an example of notifying an authorized person associated with a property (e.g., a property owner away from the property) about a threat level posed by a person at, or near, the property, according to some aspects of the present embodiments;

FIG. 5 is a flowchart illustrating a process for detecting a person approaching a property, determining a threat level associated with the person, and notifying one or more persons associated with the property about the threat level, according to some aspects of the present embodiments;

FIG. 6 is a flowchart illustrating a process for receiving identification data about a person at, or near, a property and determining a threat level associated with the person, according to some aspects of the present embodiments;

FIG. 7 is a front view of an A/V recording and communication doorbell according to an aspect of the present disclosure;

FIG. 8 is an upper front perspective view of an A/V recording and communication security camera according to an aspect of the present disclosure;

FIG. 9 is a flowchart illustrating a process for streaming and storing A/V content from an A/V recording and communication device according to various aspects of the present disclosure;

FIG. 10 is a flowchart illustrating another process for an A/V recording and communication device according to an aspect of the present disclosure;

FIG. 11 is a functional block diagram of a client device on which the present embodiments may be implemented according to various aspects of the present disclosure;

FIG. 12 is a functional block diagram of the components of the A/V recording and communication device of FIG. 7; and

FIG. 13 is a functional block diagram of a general-purpose computing system on which the present embodiments may be implemented according to various aspects of the present disclosure.

DETAILED DESCRIPTION

The following detailed description describes the present embodiments with reference to the drawings. In the drawings, reference numbers label elements of the present embodiments. These reference numbers are reproduced below in connection with the discussion of the corresponding drawing features.

The embodiments of the present dynamic identification of the threat level associated with a person using audio/video (A/V) recording and communication devices are described below with reference to the figures. These figures, and their written descriptions, indicate that certain components of the apparatus are formed integrally, and certain other components are formed as separate pieces. Those of ordinary skill in the art will appreciate that components shown and described herein as being formed integrally may in alternative embodiments be formed as separate pieces. Those of ordinary skill in the art will further appreciate that components shown and described herein as being formed as separate pieces may in alternative embodiments be formed integrally. Further, as used herein the term integral describes a single unitary piece.

With reference to FIG. 1, the present embodiments include an audio/video (A/V) recording and communication device 100 (e.g., a video doorbell, a security camera, etc.). While the present disclosure provides numerous examples of methods and systems including A/V recording and communication doorbells, the present embodiments are equally applicable for A/V recording and communication devices other than doorbells. For example, the present embodiments may include one or more A/V recording and communication security cameras instead of, or in addition to, one or more A/V recording and communication doorbells. An example A/V recording and communication security camera, as described below with reference to FIG. 8, may include substantially all of the structure and functionality of the doorbells described herein, but without the front button and its related components.

The A/V recording and communication device 100 may be located near the entrance to a structure (not shown), such as a dwelling, a business, a storage facility, etc. The A/V recording and communication device 100 includes a camera 102, a microphone 104, and a speaker 106. The camera 102 may comprise, for example, a high definition (HD) video camera, such as one capable of capturing video images at an image display resolution of 720p, or 1080p, or better. While not shown, the A/V recording and communication device 100 may also include other hardware and/or components, such as a housing, one or more motion sensors (and/or other types of sensors), a button, etc.

Additionally, the present disclosure provides numerous examples of methods and systems including A/V recording and communication devices that are powered by a connection to AC mains, but the present embodiments are equally applicable for A/V recording and communication devices that are battery powered. The A/V recording and communication device 100 may further include similar componentry and/or functionality as the wireless communication doorbells described in U.S. Patent Application Publication Nos. 2015/0022620 (application Ser. No. 14/499,828) and 2015/0022618 (application Ser. No. 14/334,922), both of which are incorporated herein by reference in their entireties as if fully set forth.

With further reference to FIG. 1, the A/V recording and communication device 100 communicates with a user's network 110, which may be for example a wired and/or wireless network. If the user's network 110 is wireless, or includes a wireless component, the network 110 may be a Wi-Fi network compatible with the IEEE 802.11 standard and/or other wireless communication standard(s). The user's network 110 is connected to another network 112, which may comprise, for example, the Internet and/or a public switched telephone network (PSTN). As described below, the A/V recording and communication device 100 may communicate with the user's client device 114 via the network 110 and the network 112 (Internet/PSTN). The user's client device 114 may comprise, for example, a mobile telephone (may also be referred to as a cellular telephone), such as a smartphone, a personal digital assistant (PDA), or another communication device. The user's client device 114 comprises a display (not shown) and related components capable of displaying streaming and/or recorded video images. The user's client device 114 may also comprise a speaker and related components capable of broadcasting streaming and/or recorded audio, and may also comprise a microphone.

The A/V recording and communication device 100 may also communicate with one or more remote storage device(s) 116 (may be referred to interchangeably as “cloud storage device(s)”), one or more servers 118, and/or a backend API (application programming interface) 120 via the network 110 (e.g., a personal wired or wireless network) and the network 112 (e.g., Internet/PSTN). While FIG. 1 illustrates the storage device 116, the server 118, and the backend API 120 as components separate from the network 112, it is to be understood that the storage device 116, the server 118, and/or the backend API 120 may be considered to be components of the network 112.

The network 112 may be any wireless network or any wired network, or a combination thereof, configured to operatively couple the above-mentioned modules, devices, and systems as shown in FIG. 1. For example, the network 112 may include one or more of the following: a PSTN (public switched telephone network), the Internet, a local intranet, a PAN (Personal Area Network), a LAN (Local Area Network), a WAN (Wide Area Network), a MAN (Metropolitan Area Network), a virtual private network (VPN), a storage area network (SAN), a frame relay connection, an Advanced Intelligent Network (AIN) connection, a synchronous optical network (SONET) connection, a digital T1, T3, E1 or E3 line, a Digital Data Service (DDS) connection, a DSL (Digital Subscriber Line) connection, an Ethernet connection, an ISDN (Integrated Services Digital Network) line, a dial-up port such as a V.90, V.34, or V.34bis analog modem connection, a cable modem, an ATM (Asynchronous Transfer Mode) connection, or an FDDI (Fiber Distributed Data Interface) or CDDI (Copper Distributed Data Interface) connection.

Furthermore, communications may also include links to any of a variety of wireless networks, including WAP (Wireless Application Protocol), GPRS (General Packet Radio Service), GSM (Global System for Mobile Communication), LTE, VoLTE, LoRaWAN, LPWAN, RPMA, LTE Cat-“X” (e.g., LTE Cat 1, LTE Cat 0, LTE Cat M1, LTE Cat NB1), CDMA (Code Division Multiple Access), TDMA (Time Division Multiple Access), FDMA (Frequency Division Multiple Access), and/or OFDMA (Orthogonal Frequency Division Multiple Access) cellular phone networks, GPS, CDPD (cellular digital packet data), RIM (Research in Motion, Limited) duplex paging network, Bluetooth radio, or an IEEE 802.11-based radio frequency network. The network can further include or interface with any one or more of the following: RS-232 serial connection, IEEE-1394 (Firewire) connection, Fibre Channel connection, IrDA (infrared) port, SCSI (Small Computer Systems Interface) connection, USB (Universal Serial Bus) connection, or other wired or wireless, digital or analog, interface or connection, mesh or Digi® networking.

The user's network 110 is also connected to one or more alert devices such as the in-home alert device 143. The alert device 143 comprises a device that is capable of providing audible and/or visual alerts. In some aspects of the present embodiments, the alert device 143 may comprise one or more colored light bulbs that are capable of emitting light in different colors (e.g., RGB color changing LED lights such as smart LED bulbs). The alert device 143 may also comprise one or more speakers that are capable of generating different sounds and/or verbal warnings. Some of the present embodiments may include a combination of colored lights and speakers as in-home alert devices. In yet other embodiments, the alert device 143 can be any other type of device that is capable of generating visual and/or audible alerts.

According to one or more aspects of the present embodiments, when a person (may be referred to interchangeably as “visitor”) arrives at the A/V recording and communication device 100, the A/V recording and communication device 100 detects the visitor's presence and begins capturing video images within a field of view of the camera 102. The A/V recording and communication device 100 may also capture audio through the microphone 104. The A/V recording and communication device 100 may detect the visitor's presence using a motion sensor, and/or by detecting that the visitor has depressed the button (e.g., a doorbell button) on the A/V recording and communication device 100.

In response to the detection of the visitor, the A/V recording and communication device 100 sends an alert to the user's client device 114 (FIG. 1) via the user's network 110 and the network 112. The A/V recording and communication device 100 also sends streaming video, and may also send streaming audio, to the user's client device 114. If the user answers the alert, two-way audio communication may then occur between the visitor and the user through the A/V recording and communication device 100 and the user's client device 114. The user may view the visitor throughout the duration of the call, but the visitor cannot see the user (unless the A/V recording and communication device 100 includes a display, which it may in some embodiments).

In some aspects of the present embodiments, instead of, or in conjunction with, the above-described alert, a different type of alert may be sent to the client device 114 (e.g., generating a different type of audible and/or visual notification compared to a typical alert). The different alert may provide the user with a threat level associated with the visitor. For instance, when the visitor is determined to be a suspicious person, then instead of, or in conjunction with, a typical alert, a second, different type of alert (e.g., a loud noise, flashing the screen, or any other type of warning notification) may be sent to the client device 114 in some of the present embodiments. Additionally, in some of the present embodiments, a visual and/or verbal notification about the level of the threat associated with the visitor may be provided to any persons present at the property (e.g., by activating one or more smart LED bulbs inside a structure at the property, where the one or more smart LED bulbs are capable of emitting differently colored lights based on different levels of the threat, or verbally warning the persons present at the property using one or more speakers installed inside the property, etc.).
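As a non-limiting illustration, the choice between a typical alert and a second, different type of alert might be sketched as follows. The `Alert` fields, the `build_alert` function, the integer encoding of threat levels, and the `SUSPICIOUS_THRESHOLD` cutoff are all hypothetical names and values introduced for this sketch only; the present disclosure does not prescribe them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    kind: str           # "typical" or "warning"
    message: str
    loud_sound: bool    # play a loud noise on the client device
    flash_screen: bool  # flash the client device screen


# Hypothetical cutoff: levels at or above this value are treated as suspicious.
SUSPICIOUS_THRESHOLD = 2


def build_alert(threat_level: Optional[int]) -> Alert:
    """Return a warning alert for a suspicious visitor, else a typical alert.

    threat_level is None when no threat information is available.
    """
    if threat_level is not None and threat_level >= SUSPICIOUS_THRESHOLD:
        return Alert(
            kind="warning",
            message=f"Suspicious visitor (threat level {threat_level})",
            loud_sound=True,
            flash_screen=True,
        )
    return Alert(
        kind="typical",
        message="Visitor detected",
        loud_sound=False,
        flash_screen=False,
    )
```

In this sketch the warning alert replaces the typical alert; an embodiment that sends both in conjunction could instead return a list containing each.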

In some instances, the identified visitor may not pose any threat at all. For example, the identified visitor may be a parcel carrier (e.g., USPS, UPS, FedEx, etc.). Some of the present embodiments may assign a value (e.g., zero) to the threat level associated with a person who does not pose a threat, such as a parcel carrier or a neighbor, and provide a notification that corresponds to such a threat level, or absence of threat (e.g., a light emitting a color associated with safety, such as green). Similarly, when the visitor is not identifiable, some of the present embodiments may assign a value to the threat level associated with the unidentified person to indicate that the visitor could not be recognized and provide a corresponding notification (e.g., a light emitting a color associated with caution, such as yellow).
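One possible way to express the correspondence between threat levels and light colors is a simple lookup table, sketched below. Only the value zero for an absent threat and the colors green and yellow come from the examples above; the `ThreatLevel` names, the values assigned to `UNIDENTIFIED` and `THREAT`, and the color red are assumptions made for illustration.

```python
from enum import Enum


class ThreatLevel(Enum):
    NO_THREAT = 0     # e.g., a parcel carrier or a neighbor
    UNIDENTIFIED = 1  # assumed value: the visitor could not be recognized
    THREAT = 2        # assumed value: the visitor is believed to pose a threat


# Each threat level corresponds to a different color of the colored light.
COLOR_FOR_LEVEL = {
    ThreatLevel.NO_THREAT: "green",     # color associated with safety
    ThreatLevel.UNIDENTIFIED: "yellow", # color associated with caution
    ThreatLevel.THREAT: "red",          # assumed color for a posed threat
}


def light_color(level: ThreatLevel) -> str:
    """Select the color the in-home light should emit for a threat level."""
    return COLOR_FOR_LEVEL[level]
```

Keeping the mapping in a table rather than in branching logic makes it straightforward to add further levels, each with its own color.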

The video images captured by the camera 102 of the A/V recording and communication device 100 (and the audio captured by the microphone 104) may be uploaded to the cloud and recorded on the remote storage device 116 (FIG. 1). In some embodiments, the video and/or audio may be recorded on the remote storage device 116 even if the user chooses to ignore the alert sent to his or her client device 114.

With further reference to FIG. 1, the system may further comprise a backend API 120 including one or more components. A backend API (application programming interface) may comprise, for example, a server (e.g., a real server, or a virtual machine, or a machine running in a cloud infrastructure as a service), or multiple servers networked together, exposing at least one API to client(s) accessing it. These servers may include components such as application servers (e.g., software servers), depending upon what other components are included, such as a caching layer, or database layers, or other components. A backend API may, for example, comprise many such applications, each of which communicates with the others using their public APIs. In some embodiments, the backend API may hold the bulk of the user data and offer the user management capabilities, leaving the clients to have very limited state.

As an example, in some of the present embodiments, one or more API servers may receive (e.g., from the A/V recording and communication device 100) captured images and/or biometric data of a person at an entry of a property and use the received images/data to determine whether the person poses a threat or not. One or more of these backend servers may employ a set of computer vision processes (e.g., face recognition, iris recognition, or any other biometrics recognition process) on one or more databases (e.g., a database for convicted felons, registered sex offenders, etc.) to recognize and report the severity of the threat (e.g., the threat level associated with the person).
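A minimal sketch of such a server-side determination follows. It assumes, purely for illustration, that an upstream computer vision process has already reduced the captured face to a numeric feature vector, and that a match is declared by cosine similarity against stored vectors; the disclosure does not specify either choice. The names `KnownPerson`, `determine_threat_level`, `UNIDENTIFIED`, the sample records, and the 0.9 similarity cutoff are likewise hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

# Hypothetical value reported when the person cannot be identified.
UNIDENTIFIED = -1


@dataclass(frozen=True)
class KnownPerson:
    label: str
    features: Tuple[float, ...]  # assumed output of a face recognition step
    threat_level: int


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def determine_threat_level(
    features: Sequence[float],
    database: Sequence[KnownPerson],
    min_similarity: float = 0.9,  # assumed cutoff for declaring a match
) -> int:
    """Return the threat level of the best database match, or UNIDENTIFIED."""
    best_match = None
    best_score = min_similarity
    for person in database:
        score = cosine_similarity(features, person.features)
        if score >= best_score:
            best_match, best_score = person, score
    return best_match.threat_level if best_match else UNIDENTIFIED


# Illustrative two-record database with made-up feature vectors.
DATABASE = [
    KnownPerson("parcel carrier", (1.0, 0.0), threat_level=0),
    KnownPerson("listed offender", (0.0, 1.0), threat_level=3),
]
```

With this sample database, a vector close to the first record yields that record's threat level, while a vector that resembles neither record closely enough yields `UNIDENTIFIED`, which a device could then map to a caution notification.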

The system 100 may further include a smart-home hub device (not shown) connected to the Network (Internet/PSTN) 112 via the user's network 110. A smart-home hub (also known as a home automation hub) device may comprise any device that facilitates communication with and control of one or more second devices, such as, but not limited to, the in-home alert device 143 and/or the first A/V recording and communication device 100. For example, the smart-home hub device may be a component of a home security system and/or a home automation system (which may be a combined home security/automation system). Where the smart-home hub device is a component of a home security system, the smart-home hub device may be a premises security system hub device. In some embodiments, the smart-home hub device may receive (e.g., from the A/V recording and communication device 100) captured images and/or biometric data of a person at an entry of a property and use the received images/data to determine whether the person poses a threat or not. Further, the smart-home hub device, instead of or in addition to the backend servers, may employ the set of computer vision processes (e.g., face recognition, iris recognition, or any other biometrics recognition process) on one or more databases (e.g., a database for convicted felons, registered sex offenders, etc.) to recognize and report the severity of the threat (e.g., the threat level associated with the person). In some embodiments, the smart-home hub device may perform all or any portion of the processes performed by the backend devices, such as the backend server 118, in dynamically identifying a threat level associated with a person, as described herein.

The backend API 120 illustrated in FIG. 1 may include one or more APIs. An API is a set of routines, protocols, and tools for building software and applications. An API expresses a software component in terms of its operations, inputs, outputs, and underlying types, defining functionalities that are independent of their respective implementations, which allows definitions and implementations to vary without compromising the interface. Advantageously, an API may provide a programmer with access to an application's functionality without the programmer needing to modify the application itself, or even understand how the application works. An API may be for a web-based system, an operating system, or a database system, and it provides facilities to develop applications for that system using a given programming language. In addition to accessing databases or computer hardware like hard disk drives or video cards, an API can ease the work of programming GUI components. For example, an API can facilitate integration of new features into existing applications (a so-called “plug-in API”). An API can also assist otherwise distinct applications with sharing data, which can help to integrate and enhance the functionalities of the applications.

The backend API 120 illustrated in FIG. 1 may further include one or more services (also referred to as network services). A network service is an application that provides data storage, manipulation, presentation, communication, and/or other capability. Network services are often implemented using a client-server architecture based on application-layer network protocols. Each service may be provided by a server component running on one or more computers (such as a dedicated server computer offering multiple services) and accessed via a network by client components running on other devices. However, the client and server components can both be run on the same machine. Clients and servers may have a user interface, and sometimes other hardware associated with them.

As discussed above, there is a significant need to identify visitors at, or near, a property dynamically (e.g., without human intervention) and to notify persons associated with the property (e.g., owners, residents, occupants, guests, etc.) about the severity of the threat posed by the visitor. It would be advantageous, therefore, if the functionality of A/V recording and communication devices (e.g., A/V doorbells, A/V security cameras, etc.) could be leveraged to identify the visitor, determine a level of threat associated with the visitor, and notify people at the property (e.g., through in-home alert devices) and/or other authorized users remote from the property (e.g., through one or more client devices). The present embodiments provide these advantages and enhancements, as described below.

In some embodiments, the threat assessment may be performed with respect to an object instead of, or in addition to, a visitor. For example, a visitor approaching a property may be carrying an object, and the threat assessment may include an analysis of the carried object to determine if it is a weapon or any other type of object that may be dangerous and/or threatening. In another example, an object may be placed on or near a property, and the threat assessment may include an analysis of the object to determine if it is a bomb or any other type of object that may be dangerous and/or threatening.

For example, some of the present embodiments may identify one or more visitors by receiving image data of the visitor(s) within a field of view of the camera of the A/V recording and communication device. Upon determining the threat level associated with each visitor (or upon determining that the threat level associated with a given visitor cannot be ascertained), some aspects of the present embodiments may provide a notification about the identified visitor and the level of threat the visitor poses. As an example, one aspect of the present embodiments turns the color of the light(s) (e.g., at least one light) inside a structure, such as a house at which the A/V recording and communication device is installed, to (i) a first color (e.g., green) when a visitor at the front door is a trusted person (e.g., a family member or a friend) known to a person associated with the property, (ii) a second color (e.g., yellow) when the visitor is not known to the person associated with the property and/or could not be identified, (iii) a third color (e.g., red) when the visitor is a known criminal and/or a known threat (e.g., a hostile neighbor), (iv) a fourth color (e.g., orange) when the visitor could not be identified but the visitor is engaged in a suspicious activity, and (v) a fifth color (e.g., blue) when the visitor is someone not personally known to the person associated with the property but is nevertheless unlikely to pose a threat (e.g., a parcel carrier).
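As a non-limiting illustration, the five-way color scheme described above may be sketched as a simple lookup table. The category names, the `light_color_for` function, and the choice to fall back to the caution color for any unrecognized input are assumptions made for this sketch only and are not part of the described embodiments.

```python
from enum import Enum


class VisitorCategory(Enum):
    """Hypothetical labels for the five visitor cases (i)-(v) described above."""
    TRUSTED = "trusted"                      # (i) family member or friend
    UNKNOWN = "unknown"                      # (ii) not known and/or not identified
    KNOWN_THREAT = "known_threat"            # (iii) known criminal or known threat
    SUSPICIOUS_UNIDENTIFIED = "suspicious"   # (iv) unidentified, suspicious activity
    BENIGN_STRANGER = "benign_stranger"      # (v) e.g., a parcel carrier


# Example colors taken from the description; any other colors could be used.
CATEGORY_TO_COLOR = {
    VisitorCategory.TRUSTED: "green",
    VisitorCategory.UNKNOWN: "yellow",
    VisitorCategory.KNOWN_THREAT: "red",
    VisitorCategory.SUSPICIOUS_UNIDENTIFIED: "orange",
    VisitorCategory.BENIGN_STRANGER: "blue",
}


def light_color_for(category):
    """Return the light color for a category.

    Assumption: anything that is not a recognized category is treated
    like an unidentified visitor and mapped to the caution color.
    """
    return CATEGORY_TO_COLOR.get(category, "yellow")
```

For instance, `light_color_for(VisitorCategory.BENIGN_STRANGER)` yields the fifth color, blue, in this sketch.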

In various embodiments, the aspects described above (e.g., detecting a visitor, capturing video images of the visitor, identifying the visitor, determining the threat level posed by the visitor, and notifying at least one person associated with the property (e.g., using in-home alert device(s) and/or client device(s))) can be performed either entirely by the A/V recording and communication device, or by the A/V recording and communication device in conjunction with one or more backend devices (e.g., servers) using one or more backend processors, one or more databases, and/or one or more networks enabling communication between the devices in the described system.

Some of the present embodiments may comprise computer vision for one or more aspects, such as recognition of persons and/or objects. Computer vision includes methods for acquiring, processing, analyzing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions. Computer vision seeks to duplicate the abilities of human vision by electronically perceiving and understanding an image. Understanding in this context means the transformation of visual images (the input of the retina) into descriptions of the world that can interface with other thought processes and elicit appropriate action. This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory. Computer vision has also been described as the enterprise of automating and integrating a wide range of processes and representations for vision perception. As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images. The image data can take many forms, such as video sequences, views from multiple cameras, or multi-dimensional data from a scanner. As a technological discipline, computer vision seeks to apply its theories and models for the construction of computer vision systems.

One aspect of computer vision comprises determining whether or not the image data contains some specific object, feature, or activity. Different varieties of computer vision recognition include the following. Object recognition (also called object classification): one or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene. Identification: an individual instance of a person or an object is recognized. Examples include identification of a specific person's face or fingerprint, identification of handwritten digits, or identification of a specific vehicle. Detection: the image data are scanned for a specific condition. Examples include detection of possible abnormal cells or tissues in medical images or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used for finding smaller regions of interesting image data that can be further analyzed by more computationally demanding techniques to produce a correct interpretation.

Several specialized tasks based on computer vision recognition exist, such as facial recognition and shape recognition technology (SRT), the latter of which differentiates human beings (e.g., head and shoulder patterns) from objects.

Typical functions and components (e.g., hardware) found in many computer vision systems are described in the following paragraphs. The present embodiments may include at least some of these aspects. For example, with reference to FIG. 12, embodiments of the present A/V recording and communication device 130 may include a computer vision module 189. The computer vision module 189 may include any of the components (e.g., hardware) and/or functionality described herein with respect to computer vision, including, without limitation, one or more cameras, sensors, and/or processors. In some embodiments, the microphone 158, the imager 171, and/or the camera processor 170 may be components of the computer vision module 189.

Image acquisition: A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, may include range sensors, tomography devices, radar, ultra-sonic cameras, etc. Depending on the type of sensor, the resulting image data may be a 2D image, a 3D image, or an image sequence. The pixel values may correspond to light intensity in one or several spectral bands (gray images or color images), but can also be related to various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance.

Pre-processing: Before a computer vision method can be applied to image data in order to extract some specific piece of information, it is usually beneficial to process the data in order to assure that it satisfies certain assumptions implied by the method. Examples of pre-processing include, but are not limited to, re-sampling in order to assure that the image coordinate system is correct, noise reduction in order to assure that sensor noise does not introduce false information, contrast enhancement to assure that relevant information can be detected, and scale space representation to enhance image structures at locally appropriate scales.
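By way of a minimal sketch only, two of the pre-processing examples named above (noise reduction and contrast enhancement) might look as follows for a grayscale image stored as a list of rows. The 3x3 mean filter and the linear min-max stretch are techniques chosen for this illustration; the present embodiments are not limited to them, and the function names are hypothetical.

```python
def mean_filter_3x3(image):
    """Noise reduction: replace each pixel with the mean of its 3x3
    neighborhood (clipped at the image borders)."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = sum(window) / len(window)
    return out


def stretch_contrast(image, lo=0.0, hi=255.0):
    """Contrast enhancement: linearly rescale pixel values so the darkest
    pixel maps to `lo` and the brightest maps to `hi`."""
    flat = [p for row in image for p in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:
        # A flat image has no contrast to stretch; return a uniform image.
        return [[lo for _ in row] for row in image]
    scale = (hi - lo) / (brightest - darkest)
    return [[lo + (p - darkest) * scale for p in row] for row in image]
```

In a pipeline of this kind, the smoothed and stretched image would then be passed to the feature extraction step.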

Feature extraction: Image features at various levels of complexity are extracted from the image data. Typical examples of such features are lines, edges, and ridges, and localized interest points such as corners, blobs, or points. More complex features may be related to texture, shape, or motion.

Detection/segmentation: At some point in the processing a decision may be made about which image points or regions of the image are relevant for further processing. Examples are: selection of a specific set of interest points; segmentation of one or multiple image regions that contain a specific object of interest; and segmentation of the image into nested scene architecture comprising foreground, object groups, single objects, or salient object parts (also referred to as spatial-taxon scene hierarchy).

High-level processing: At this step, the input may be a small set of data, for example a set of points or an image region that is assumed to contain a specific object. The remaining processing may comprise, for example: verification that the data satisfy model-based and application-specific assumptions; estimation of application-specific parameters, such as object pose or object size; image recognition, i.e., classifying a detected object into different categories; and image registration, i.e., comparing and combining two different views of the same object. Decision making: Making the final decision required for the application, for example match/no-match in recognition applications.

One or more of the present embodiments may include a vision processing unit (not shown separately, but may be a component of the computer vision module 189). A vision processing unit is an emerging class of microprocessor; it is a specific type of AI (artificial intelligence) accelerator designed to accelerate machine vision tasks. Vision processing units are distinct from video processing units (which are specialized for video encoding and decoding) in their suitability for running machine vision algorithms such as convolutional neural networks, SIFT, etc. Vision processing units may include direct interfaces to take data from cameras (bypassing any off-chip buffers), and may have a greater emphasis on on-chip dataflow between many parallel execution units with scratchpad memory, like a manycore DSP (digital signal processor). But, like video processing units, vision processing units may have a focus on low precision fixed point arithmetic for image processing.

Some of the present embodiments may use facial recognition hardware and/or software, as a part of the computer vision system. Various types of facial recognition exist, some or all of which may be used in the present embodiments.

Some face recognition algorithms identify facial features by extracting landmarks, or features, from an image of the subject's face. For example, an algorithm may analyze the relative position, size, and/or shape of the eyes, nose, cheekbones, and jaw. These features are then used to search for other images with matching features. Other algorithms normalize a gallery of face images and then compress the face data, only saving the data in the image that is useful for face recognition. A probe image is then compared with the face data. One of the earliest successful systems is based on template matching techniques applied to a set of salient facial features, providing a sort of compressed face representation.
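The landmark-based approach described above can be illustrated with a deliberately simplified sketch. It assumes that a separate detector (not shown) has already located four landmarks per face, that features are distances between landmark pairs normalized by the distance between the eyes, and that a probe matches a gallery entry when the Euclidean distance between feature vectors is below a threshold. The landmark names, the chosen pairs, and the threshold value are all hypothetical; practical systems use many more landmarks and more robust comparisons.

```python
import math

# Hypothetical landmark pairs whose relative distances describe the face.
LANDMARK_PAIRS = [
    ("left_eye", "nose"),
    ("right_eye", "nose"),
    ("nose", "chin"),
    ("left_eye", "chin"),
    ("right_eye", "chin"),
]


def geometric_features(landmarks):
    """Turn landmark coordinates (name -> (x, y)) into a feature vector of
    distances normalized by the inter-eye distance, so that the features
    do not change when the whole face is uniformly scaled."""
    eye_distance = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    return [
        math.dist(landmarks[a], landmarks[b]) / eye_distance
        for a, b in LANDMARK_PAIRS
    ]


def best_match(probe_features, gallery, max_distance=0.1):
    """Return the name of the closest gallery entry, or None when no entry
    is within `max_distance` (an assumed, untuned threshold)."""
    best_name, best_distance = None, float("inf")
    for name, features in gallery.items():
        distance = math.dist(probe_features, features)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None
```

Returning `None` when nothing is close enough corresponds to the case, discussed elsewhere herein, in which the visitor cannot be identified.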

Recognition algorithms can be divided into two main approaches: geometric, which looks at distinguishing features, and photometric, which is a statistical approach that distills an image into values and compares the values with templates to eliminate variances.

Popular recognition algorithms include principal component analysis using eigenfaces, linear discriminant analysis using the Fisherface algorithm, elastic bunch graph matching, the hidden Markov model, multilinear subspace learning using tensor representation, and the neuronal motivated dynamic link matching.

Further, a newly emerging trend, claimed to achieve improved accuracy, is three-dimensional face recognition. This technique uses 3D sensors to capture information about the shape of a face. This information is then used to identify distinctive features on the surface of a face, such as the contour of the eye sockets, nose, and chin.

One advantage of 3D face recognition is that, unlike other techniques, it is not affected by changes in lighting. It can also identify a face from a range of viewing angles, including a profile view. Three-dimensional data points from a face vastly improve the precision of face recognition. 3D research is enhanced by the development of sophisticated sensors that do a better job of capturing 3D face imagery. The sensors work by projecting structured light onto the face. Up to a dozen or more of these image sensors can be placed on the same CMOS chip, with each sensor capturing a different part of the spectrum.

Another variation is to capture a 3D picture by using three tracking cameras that point at different angles: one camera pointing at the front of the subject, a second one to the side, and a third one at an angle. All these cameras work together to track a subject's face in real time and to detect and recognize the face.

Another emerging trend uses the visual details of the skin, as captured in standard digital or scanned images. This technique, called skin texture analysis, turns the unique lines, patterns, and spots apparent in a person's skin into a mathematical space.

Another form of taking input data for face recognition is by using thermal cameras, which may only detect the shape of the head and ignore the subject's accessories, such as glasses, hats, or makeup.

Further examples of automatic identification and data capture (AIDC) and/or computer vision that can be used in the present embodiments to verify the identity and/or authorization of a person include, without limitation, biometrics. Biometrics refers to metrics related to human characteristics. Biometric authentication (or realistic authentication) is used in various forms of identification and access control. Biometric identifiers are the distinctive, measurable characteristics used to label and describe individuals. Biometric identifiers can be physiological characteristics and/or behavioral characteristics. Physiological characteristics may be related to the shape of the body. Examples include, but are not limited to, fingerprints, palm veins, facial recognition, three-dimensional facial recognition, skin texture analysis, DNA, palm prints, hand geometry, iris recognition, retina recognition, and odor/scent recognition. Behavioral characteristics may be related to the pattern of behavior of a person, including, but not limited to, typing rhythm, gait, and voice recognition.

The present embodiments may use any one, or any combination of more than one, of the foregoing biometrics to identify and/or authenticate a person who is either suspicious or who is authorized to take certain actions with respect to a property or expensive item of collateral. For example, the computer vision module 189, and/or the camera 134 and/or the processor may receive information about the person using any one, or any combination of more than one, of the foregoing biometrics.

FIG. 2 illustrates an example of notifying a person at a property about a visitor at the entrance to a structure on the property and the threat (if any) the visitor might pose, according to some aspects of the present embodiments. The present embodiments are not limited to notifying any particular person or class of persons. The person notified about the visitor may be, for example, the owner of the property, a resident of the property, an occupant of the property, any person present at the property when the visitor is also present, any authorized user of the A/V recording and communication device (even if the authorized user is not present at the property when the visitor is also present), or any other person. Examples of the present embodiments may be described herein with respect to a particular person or class of persons, but any such examples should not be construed as limiting the present embodiments to notifying any particular person(s) or class of persons to the exclusion of notifying any other person(s) or class of persons.

With reference to FIG. 2, the present embodiments comprise an A/V recording and communication device 220 (e.g., a video doorbell, a security camera, etc.) detecting a person 205 (who may also be referred to as a “visitor”) standing near an outside door 245 of a structure, such as a house 200, and providing an alert to a person 210 (who may also be referred to as a “resident”) watching TV inside the house 200, wherein the alert also provides the person 210 with information about the severity of a threat the visitor 205 might pose to the person 210. As shown, the house is equipped with a set of smart LED light bulbs 230, 240 (e.g., one light bulb in each room) that are capable of emitting light in different colors. While FIG. 2 shows a house 200, the present embodiments are not limited to houses. Rather, the present embodiments are applicable to any type of property and/or structure, including without limitation houses, apartments, offices, businesses, storage facilities, etc. In fact, certain of the present embodiments are applicable to broader environments, such as neighborhoods, as further described below.

With reference to FIG. 2, when the A/V recording and communication device 220 detects the visitor 205's presence, the device 220 captures video images of persons and/or objects that are within a field of view of the camera 225 of the A/V recording and communication device 220. The A/V recording and communication device 220 may also capture audio through the device's microphone. As described above, the A/V recording and communication device 220 may detect the visitor 205's presence by detecting motion using its camera 225 and/or one or more motion sensors. The A/V recording and communication device 220 may also detect the visitor 205's presence when the visitor 205 presses a front button of the A/V recording and communication device 220 (e.g., when the A/V recording and communication device 220 is a video doorbell).

As soon as the visitor's presence is detected (through any of the above-mentioned methods), the A/V recording and communication device 220 may send a notification (along with streaming video and, in some embodiments, audio) to a client device as described below with reference to FIGS. 9 and 10. Various aspects of the present embodiments may also notify any persons inside the property of a threat level associated with the detected visitor 205. For example, some aspects of the present embodiments, instead of, or in conjunction with, a notification sent to one or more client devices, may provide a different type of alert that is indicative of a threat level associated with the visitor.

For instance, when the visitor is determined to be a criminal, instead of, or in conjunction with, a regular notification (e.g., a message along with A/V streaming sent to the client device), a second, different type of alert (e.g., a loud noise, screen flashing, or any other type of warning notification) may be sent to the client device in some of the present embodiments. Additionally, in some of the present embodiments, a visual and/or audible notification about the level of the threat associated with the visitor 205 may be sent to any persons present at the property (e.g., by activating one or more smart LED lights such as the LED lights 230 and 240, by verbally warning the residents using one or more speakers installed inside the property, etc.). When the identified visitor 205 does not pose any threat (e.g., the identified person is a family member), some of the present embodiments may assign a particular value (e.g., zero) to the threat level associated with a non-threatening visitor and provide a notification that corresponds to such a threat level (e.g., the LED lights 230, 240 turn green). When the visitor cannot be identified, some aspects of the present embodiments may assign a different value to the threat level associated with the unidentified person to indicate that the visitor could not be recognized and provide a corresponding notification (e.g., the LED lights 230, 240 turn yellow).

The smart LED lights 230, 240 are merely one example of a notification device (e.g., the in-home alert device 143 shown in FIG. 1) that can be used in connection with the present embodiments to provide a notification or warning to any persons inside the structure 200 of the threat level associated with the visitor 205. Other examples of the in-home alert device 143 include discrete devices that may be located anywhere throughout the structure 200, such as devices that may be placed on tabletops or shelves, and which may include different modes of providing notifications, such as display screens, multi-colored lights, speakers for audio notifications, etc. Any of these notification devices, including the smart LED lights 230, 240, may be configured to communicate with other devices through wired and/or wireless connections through the user's network 110 (FIG. 1) and/or the network 112, and/or through direct communication with other devices using one or more short-range communication protocols, such as Bluetooth, Bluetooth low energy (LE), ANT, ANT+, ZigBee, etc.

In order to identify a visitor and determine a threat level associated with the visitor, the A/V recording and communication device 220 of some of the present embodiments may send (e.g., through wired and/or wireless networks) the visitor's identification data (e.g., a set of images taken of the visitor) to one or more backend devices and/or services (e.g., backend servers). The servers, in turn, may identify the visitor (e.g., using one of the above-described AIDC or computer vision methods) and assign a threat level to the visitor using one or more databases (e.g., databases of authorized visitors, criminals, suspicious persons, etc.). For example, with reference to FIGS. 1 and 12, information received by the computer vision module 189 of the A/V recording and communication device 220 may be sent to one or more network devices, such as the server 118 and/or the backend API 120 (e.g., in a computer vision query signal) to query about the threat level associated with a visitor. In some aspects of the present embodiments, however, the A/V recording and communication device 220 may make such a determination itself and without exchanging identification data with backend devices. In yet other aspects of the present embodiments, the identification of the visitor and the threat level associated with the visitor may be determined by a combination of the A/V recording and communication device 220 and one or more backend devices.

In some aspects of the present embodiments, the information sent to the backend devices and/or services may be compared with other information stored in one or more databases to determine whether there is a match. For example, one or more images (and/or other biometric data) of the visitor may be compared with photos and/or images (and/or other biometric data) of known suspicious persons, criminals, etc. If there is a match, a level of threat may be retrieved from the databases, or assigned by the servers based on which database contained a match for the visitor. For example, if the visitor is matched against a criminal or suspicious persons' database, the level of threat assigned to the visitor may be set to the highest level. Conversely, when the visitor is matched against a known and authorized persons' database, the level of threat assigned to the visitor may be set to the lowest level. When the visitor cannot be identified (e.g., cannot be matched against any of the databases), an unknown visitor status (or threat level) may be assigned to the visitor in some embodiments. The databases described above, and elsewhere herein, are merely examples, and should not be construed as limiting. In some embodiments, information about visitors may be retrieved from a single database, or from a plurality of databases other than those described herein.
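As one hedged illustration of assigning a threat level based on which database contains a match, the sketch below assumes that a prior recognition step has already reduced the visitor to an identifier (or `None` when no identity could be established) and that each database can be represented as a set of identifiers. The numeric values, the `ThreatLevel` names, and the decision to check the threat database before the authorized database are assumptions of this sketch.

```python
from enum import IntEnum


class ThreatLevel(IntEnum):
    """Hypothetical numeric levels; the description only requires a lowest
    level, a highest level, and an unknown status."""
    LOWEST = 0
    UNKNOWN = 1
    HIGHEST = 2


def assign_threat_level(visitor_id, threat_db, authorized_db):
    """Assign a level based on which database contains the visitor.

    Assumption: the criminal/suspicious database is consulted first, so a
    person who somehow appears in both databases is treated as a threat.
    """
    if visitor_id is not None and visitor_id in threat_db:
        return ThreatLevel.HIGHEST
    if visitor_id is not None and visitor_id in authorized_db:
        return ThreatLevel.LOWEST
    # No match in any database: unknown visitor status.
    return ThreatLevel.UNKNOWN
```

A server could then place the returned value in its response to the querying device.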

The databases may contain as much information as possible about each known suspicious person, criminal, etc., such as their facial features or characteristics, name, aliases, and/or criminal history. However, the databases may also contain as little information as an image of the face of a known suspicious person, criminal, etc., even if that person is otherwise unidentified by name or other typical identifying information. In some embodiments, the database(s) of known suspicious persons, criminals, etc. may be one or more databases of convicted felons and/or registered sex offenders. In other embodiments, the database of known suspicious persons may be modified by the user, such as through the client device. Specifically, the user may, upon review of stored images of visitors, flag a particular stored image of a visitor as suspicious or threatening. This image may then be uploaded into the database. The user can further mark the flagged person as a “public” suspicious or threatening person, who might be exhibiting suspicious or threatening behavior as to an entire neighborhood, such as, for example, a suspicious or threatening person that the user saw breaking a neighbor's windows, or as a “private” suspicious or threatening person, such as, for example, a hostile co-worker whose presence may be suspicious or threatening with respect to the user's home, but not to the public at large.
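The “public” and “private” flags described above could be represented, purely as an example, by a small record type and a filter. The field names and the rule that a private flag is consulted only for the user who created it are assumptions of this sketch; the description does not prescribe a particular data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlaggedPerson:
    """Hypothetical record for a user-flagged image of a visitor."""
    image_id: str
    flagged_by: str   # identifier of the user who flagged the image
    visibility: str   # "public" or "private"


def flags_applicable_to(flags, user_id):
    """Return the flags to consult for a given user's device.

    Assumption: public flags apply to every user, while private flags
    apply only to the user who created them.
    """
    return [
        flag for flag in flags
        if flag.visibility == "public" or flag.flagged_by == user_id
    ]
```

Under these assumptions, a hostile co-worker flagged privately by one user would not trigger alerts at other users' properties.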

Additionally, a user may upload one or more images of persons that the user considers suspicious or threatening into the database, from sources other than those captured by the A/V recording and communication device 220, e.g., from the user's smartphone camera. This example embodiment allows for the user to receive alerts about persons that are suspicious or threatening to the user, for example, an ex-spouse, a hostile co-worker, a hostile neighbor, etc., but who are not otherwise known to be suspicious or threatening to society at large. Furthermore, in some embodiments, a crime(s) and/or suspicious event(s) may have been recorded by an A/V recording and communication device other than the ones associated with the owner/occupant of the property. For example, another user of an A/V recording and communication device may view video footage that was recorded by his or her device and determine that the person or persons in the video footage are, or may be, engaged in suspicious or threatening activity and/or criminal activity. The other user may then share that video footage with one or more other people, such as other users of A/V recording and communication devices, and/or one or more organizations, including one or more law enforcement agencies. The present embodiments may leverage this shared video footage for use in comparing with the information in the computer vision query to determine whether a person detected in the area about the A/V recording and communication device 220 is the same person that was the subject of (and/or depicted in) the shared video footage.

After assigning a threat level value to the visitor 205, the network device(s), such as the server 118 and/or the backend API 120 (FIG. 1), may send a computer vision response signal to the A/V recording and communication device 220, which may contain the threat level assigned to the visitor. After receiving this signal, the A/V recording and communication device 220 may translate the threat level value assigned to the visitor 205 to a particular color of light to be emitted by the LED lights 230 and 240 (FIG. 2) inside the structure 200 (if the in-home alert device is a smart LED light). In some aspects of the present embodiments, the alert signals to the in-home alert devices may also be sent by the backend servers. That is, not only do the backend servers of some embodiments determine the threat level associated with a visitor, but also the servers themselves may send a threat level signal to activate the in-home alert devices directly, such as via the networks 110, 112, instead of sending the threat level signal to the A/V recording and communication device, which then relays the threat level signal to the in-home alert device(s).

In addition to sending alert signals to in-home notification devices and other client devices, some aspects of the present embodiments may also provide notifications that are directed to larger groups of people, such as a neighborhood, when an A/V recording and communication device detects a threat at a property located in the neighborhood. That is, in some of the present embodiments, when a high-level threat is detected at a first property in a neighborhood, in addition to notifying persons at the first property (and/or any remotely located authorized person(s) having a client device), some embodiments may provide audible and/or visual notifications to other persons in the neighborhood, including residents and owners of other properties that are located within a certain distance from the first property in the neighborhood. In one aspect of the present embodiments, the other neighbors (and/or any persons present in the neighborhood) may be notified through the street lights installed in the neighborhood. For example, the street lights may be turned on, may start flashing, may emit different colors of light, or may provide notification through any other means that will draw the neighbors' attention. Some embodiments may provide audible or verbal notification (e.g., through a set of speakers that are installed in the neighborhood), in addition to or instead of any visual notification.

With reference to the example illustrated in FIG. 2, after the presence of the visitor 205 is detected by the A/V recording and communication device 220 near the house 200, the device 220 may send one or more images of the visitor 205 to the backend servers in a query signal about the identification of the detected person. The servers may determine that the visitor 205 is a known and authorized person (e.g., a friend or family member) and send a response signal to the device 220 indicating that the visitor 205 is an authorized person. Subsequently, the A/V recording and communication device 220 may send a signal (e.g., through the wired and/or wireless network 110) to the LED light 240 to emit a green light in the TV room. The resident 210 watching TV in this room is notified of the presence of a known person (e.g., a friend) at the door by observing the green light in the room (even before the visitor 205 activates the doorbell 220).

In some aspects of the present embodiments, not every LED light inside the house may be activated when a notification of threat level is provided to the persons inside the structure 200. For example, in some embodiments, only the light(s) of the room (or rooms) that is/are occupied may be activated. Some of the present embodiments may determine which rooms are occupied by employing a set of motion sensors installed in the house, through detection of the client devices carried by the occupants, through other A/V recording and communication devices installed inside the house 200, etc. In the illustrated example, the LED light 240 in the TV room is activated because a person 210 is present there, while the LED light 230 in the bedroom remains inactive because no person is present there. In some other embodiments, such as the ones described below with reference to FIGS. 3 and 4, all of the lights may be activated for notifying the occupants irrespective of which room or rooms are occupied.
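By way of illustration only, the room-selection behavior described above might be sketched as follows. The Room record, the lights_to_activate function, and the notify_all switch are hypothetical names introduced for this sketch; how occupancy is actually determined (motion sensors, client devices, other A/V devices) is outside its scope and is represented by a simple boolean.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Room:
    name: str
    light_id: int      # reference numeral of the LED light in the room
    occupied: bool     # assumed to be supplied by some occupancy source


def lights_to_activate(rooms: List[Room], notify_all: bool = False) -> List[int]:
    """Return the IDs of the lights that should show the threat-level color.

    With notify_all=False only lights in occupied rooms are selected;
    with notify_all=True every light is selected regardless of occupancy.
    """
    return [room.light_id for room in rooms if notify_all or room.occupied]


# Mirrors the illustrated example: bedroom empty, TV room occupied.
rooms = [Room("bedroom", 230, False), Room("tv_room", 240, True)]
occupied_only = lights_to_activate(rooms)
all_lights = lights_to_activate(rooms, notify_all=True)
```

In this sketch the notify_all switch corresponds to the alternative embodiments in which all of the lights are activated irrespective of occupancy.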

The A/V recording and communication device 220 of some of the present embodiments may recognize a suspicious activity conducted by a visitor and notify the owners and/or occupants of the property of a high level of threat associated with the visitor regardless of the threat level assigned to the visitor (e.g., through using the databases). That is, in some aspects of the present embodiments, when a person at, or near, a property engages in a suspicious activity, the A/V recording and communication device may send a “high level of threat” signal to the in-home alert devices (and/or to other client devices) even if the level of threat associated with the visitor was not recognizable (e.g., the visitor's identity could not be matched against any of the databases), or even when the visitor is determined to be associated with a lower level of threat (e.g., visitor is known by the owner of the property).
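As a non-limiting sketch of the override described above, the following assumes three hypothetical threat labels and a boolean indicating whether a suspicious activity was observed. The Threat enumeration and the effective_threat function are illustrative names only and are not part of the described embodiments.

```python
from enum import Enum
from typing import Optional


class Threat(Enum):
    LOW = "low"                    # e.g., visitor known to the owner
    UNIDENTIFIED = "unidentified"  # no match in any database
    HIGH = "high"


def effective_threat(database_level: Optional[Threat],
                     suspicious_activity: bool) -> Threat:
    """Combine the database-derived level with observed behavior.

    A suspicious activity yields HIGH regardless of the database result,
    including when no database result is available (None).
    """
    if suspicious_activity:
        return Threat.HIGH
    if database_level is None:
        return Threat.UNIDENTIFIED
    return database_level
```

For example, a known visitor who is observed engaging in a suspicious activity would be reported as Threat.HIGH under this sketch, while the same visitor behaving normally would remain Threat.LOW.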

One example of a suspicious activity that could cause the A/V recording and communication device to send a “high level of threat” signal to the in-home alert devices and/or to other client devices is loitering. Loitering is often a prelude to a number of property and personal crimes, such as burglary, vandalism, breaking-and-entering, home invasion robbery, etc. Loitering may be identified using several of the present embodiments in a variety of ways. For example, in some embodiments, the A/V recording and communication device 220 is configured to record and save image data of all persons who enter the field of view of the camera 225 to create saved visitor images. These saved visitor images may then be automatically compared to the images of subsequent visitors within a certain period of time. Then, using the saved visitor images and the image data from a new visitor, if it is determined that the visitor has entered the field of view of the camera more than once within a predetermined period of time (which may be referred to as a “suspicious loitering time warning value”), the process may set a suspicious person warning flag and/or generate and send alerts through the in-home alert devices and/or other client devices.
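The repeated-appearance test described above can be sketched, under stated assumptions, as follows. The visitor_id argument stands in for the outcome of comparing new image data against saved visitor images (the comparison itself is not shown), and LoiteringTracker, record, and window_seconds are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from typing import DefaultDict, List


class LoiteringTracker:
    """Flags a visitor seen more than once within a time window."""

    def __init__(self, window_seconds: float) -> None:
        # window_seconds plays the role of the predetermined period of time.
        self.window_seconds = window_seconds
        self._sightings: DefaultDict[str, List[float]] = defaultdict(list)

    def record(self, visitor_id: str, timestamp: float) -> bool:
        """Record a sighting and return True if the warning flag should be set."""
        # Keep only earlier sightings that still fall inside the window.
        recent = [t for t in self._sightings[visitor_id]
                  if timestamp - t <= self.window_seconds]
        recent.append(timestamp)
        self._sightings[visitor_id] = recent
        return len(recent) > 1


tracker = LoiteringTracker(window_seconds=600)
first = tracker.record("visitor-a", 0)      # first appearance
second = tracker.record("visitor-a", 300)   # same visitor, inside the window
other = tracker.record("visitor-b", 310)    # a different visitor
later = tracker.record("visitor-a", 2000)   # earlier sightings have expired
```

In this sketch only the second sighting of "visitor-a" sets the flag; the sighting at 2000 seconds does not, because the earlier sightings fall outside the 600-second window.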

Some embodiments may identify loitering as a result of a prolonged presence of the same person in the field of view of the camera 225 of the A/V recording and communication device 220. In other embodiments, the process may include determining whether the doorbell of the A/V recording and communication device 220 has been activated. Then, if the doorbell has not been activated, and the suspicious loitering time warning value has been exceeded, the process may set a warning flag. In another embodiment for identifying suspicious behavior, including loitering, the process may employ two distinct A/V recording and communication devices. This method can advantageously identify suspicious behavior, for example, in the form of a person first approaching the front door of a property and then the back door of the same property.

Another form of suspicious behavior that can be identified by some of the present embodiments is carrying a suspicious object, such as a weapon or a burglary tool (e.g., a crowbar). The present embodiments also contemplate numerous methodologies for determining whether an object carried by a person who is present within the camera's field of view is a suspicious item, such as a weapon or burglary tool 255 (FIG. 4). Any or all of these methodologies may include one or more aspects of computer vision. For example, in some embodiments, received image data of an object carried by a person within the camera's field of view may be determined to be a suspicious item by using object recognition software to compare images received from the A/V recording and communication device 220 to a database of images of weapons and/or burglary tools and/or other types of suspicious objects. Upon determining that a person is carrying a suspicious object or a weapon, a suspicious person warning flag may be set by some of the present embodiments.

Another form of suspicious behavior is intentionally obscuring, or partially obscuring, a visitor's face, so that it cannot be seen or recognized by the A/V recording and communication device 220. In embodiments of the present methods, the facial recognition software and the object recognition software can be used to interact with one another, or to act alone, in order to determine, based on received image data of a person within the field of view of a camera 225 of the A/V recording and communication device 220, that the person has used an object to obscure or partially obscure his or her face. When the process determines that a person is in the field of view of the camera, but that the person's face is obscured, or is obscured for some predetermined period of time, or that the person's face is obscured at the time that the person activates the doorbell, a suspicious person warning flag may be set by some embodiments.

Many more examples of suspicious activities, as well as methods of dynamic recognition of suspicious persons and/or activities by an A/V recording and communication device, are provided in U.S. Provisional Patent Application Ser. No. 62/464,342, filed on Feb. 27, 2017, entitled “IDENTIFICATION OF SUSPICIOUS PERSONS USING AUDIO/VIDEO RECORDING AND COMMUNICATION DEVICES,” which is incorporated herein by reference in its entirety as if fully set forth.

FIG. 3 illustrates an example of notifying persons at a property (and/or associated with the property but not necessarily present at the premises) about an object placed near the entrance of the property, according to some of the present embodiments. As discussed above, not every notification of a threat level is necessarily associated with a person or visitor near the property. Some aspects of the present embodiments may provide a threat level notification (to residents of the property, to persons present at the property, to client devices, etc.) when an object (e.g., a package) is detected at, or near, the property. FIG. 3 includes the same house 200, entry point 245 (e.g., front door), and A/V recording and communication device 220 that are shown in FIG. 2. However, in FIG. 3, instead of a person being near the entrance of the property, a package 250 is placed at the property's front door 245 by a delivery service 260.

When the package 250 is left at the door 245 (e.g., by a deliveryman), an alert may be sent to the in-home alert devices 230 and 240 (in addition to or instead of an alert sent to one or more client devices associated with the A/V recording and communication device 220). Some of the present embodiments may assign a neutral level of threat to objects other than human beings when those objects are placed near a property. In some such embodiments, when the A/V recording and communication device 220 receives a signal (e.g., from the backend servers) indicating that the object 250 is a package, the device 220 may activate the LED lights 230 and 240 to emit a neutral color light, such as, for example, blue, and/or may send an alert to one or more client devices about the presence of the package 250 at the door 245 of the house 200.

FIG. 4 illustrates an example of notifying an authorized person associated with a property (e.g., a property owner or resident away from the property) about a threat level posed by a person at, or near, the property, according to some aspects of the present embodiments. As shown in FIG. 4, an intruder 215 holding a crowbar 255 is approaching the house 200 while a person 265 (e.g., the owner of the house 200) having a client device 235, which is associated with the A/V recording and communication device 220, is away from the house 200. As described above, since a crowbar 255 can be determined to be a suspicious object, some of the present embodiments may send a high level of threat signal to the in-home alert devices 230, 240 (to notify persons inside the house) and/or to any client device(s) 235 associated with the A/V recording and communication device 220 (whether inside the property or away from the property).

As described above, a received image of an object carried by a person within the camera's field of view may be determined to be a suspicious item by using object recognition software to compare the received image from the A/V recording and communication device 220 to one or more database(s) of images of weapons and/or other types of suspicious objects, such as burglary tools. Upon determining that the intruder 215 is carrying a suspicious object (such as the depicted crowbar 255), some of the present embodiments may activate a high-level-threat alert without attempting to determine (e.g., using other remote servers and/or databases) the identity of the person carrying the suspicious object. In yet other aspects of the present embodiments, the A/V recording and communication device 220 may send a query signal to one or more backend devices to attempt to identify the person and the threat level associated with the person carrying the suspicious object, irrespective of recognition of the suspicious object carried by the person.

As shown in FIG. 4, a serious threat level notification may be sent to the persons within the structure through the LED lights 230 and 240 inside the property by emitting a red light (or another color associated with danger). Simultaneously, a strong-threat notification may also be sent to the client device 235, while the user 265 is away from the house 200 (e.g., at work). The threat level notification may be sent to the client device 235 through one or more networks such as the user network 110 and the network 112 described above with reference to FIG. 1. When the client device 235 receives a severe threat alert, depending on the configuration of the device 235, the device 235 may provide one or more audible and/or visual notifications to the user 265. For example, a loud noise and/or a warning statement might be broadcast from the speaker(s) of the device 235 in some embodiments. In some other embodiments, a display screen of the device 235 may flash and/or one or more warning messages or popups may appear (e.g., in red or other colors) on the display screen of the device 235. In yet other embodiments, a combination of verbal and visual notifications may be provided to the user 265.

FIG. 5 is a flowchart illustrating a process for detecting a person approaching a property or moving around a property, determining a threat level associated with the person, and notifying one or more persons present at the property and/or associated with the property about the threat level, according to the present embodiments. In some of the present embodiments, this process may be performed by an A/V recording and communication device such as the A/V recording and communication device 220 shown in FIGS. 2-4.

At block 510, the process detects a person near a property and/or within a field of view of a camera of the A/V recording and communication device. As described above, the process may detect the visitor's presence by detecting motion using the camera and/or one or more motion sensors of the A/V recording and communication device. The process may also detect the visitor's presence when the visitor presses a doorbell button of the A/V recording and communication device (e.g., if the A/V recording and communication device is a video doorbell). As soon as the visitor's presence is detected, the process may send, at block 520, identification data to one or more servers that may be capable of identifying and/or assigning a threat level to the person. In one aspect of the present embodiments, the process may transmit one or more images to the servers to be used to recognize the person's face.

At block 530, the process receives a threat level alert back from the server(s) with an indication of the level of the threat assigned to the person. After receiving the alert, at block 540, the process translates the received alert to audio and/or visual notifications about the level of the threat associated with the person. For example, the process may activate one or more smart LED lights inside the property. These LED lights are capable of emitting different colors of light based on the severity of the threat the person may pose (e.g., a green color for authorized visitors, a red color for intruders, a yellow color for unidentified persons, etc.). Additionally, the process may provide audible notifications. For example, in some embodiments, the notification may comprise an audible alarm emitted from a speaker of the A/V recording and communication device and/or one or more speakers installed inside the house. The audible alarm may be any loud noise likely to attract attention when the person is determined to be suspicious. In some aspects of the present embodiments, a verbal notification corresponding to the level of threat may be provided to the residents. For example, the verbal notification may inform the residents of the house about a friend, an unidentified person, a criminal, or a suspicious person being at the door of the house.
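A minimal sketch of the translation performed at block 540 is given below, assuming the threat level arrives as a short text label. The color assignments follow the examples given in the present description (green, yellow, red, and the neutral blue used for packages); the label strings, the message wording, the translate_alert function, and the fallback for unknown labels are assumptions made for illustration.

```python
from typing import Dict, Tuple

# Colors follow the examples in the description; the keys are hypothetical labels.
THREAT_COLORS: Dict[str, str] = {
    "authorized": "green",
    "unidentified": "yellow",
    "intruder": "red",
    "neutral": "blue",   # e.g., a package left at the door
}

# Illustrative wording only; the description does not prescribe exact messages.
VERBAL_MESSAGES: Dict[str, str] = {
    "authorized": "A known visitor is at the door.",
    "unidentified": "An unidentified person is at the door.",
    "intruder": "Warning: a suspicious person is at the door.",
    "neutral": "An object has been left at the door.",
}


def translate_alert(threat_label: str) -> Tuple[str, str]:
    """Map a received threat label to (light color, verbal notification).

    Assumption: an unrecognized label is treated as "unidentified".
    """
    label = threat_label if threat_label in THREAT_COLORS else "unidentified"
    return THREAT_COLORS[label], VERBAL_MESSAGES[label]


color, message = translate_alert("intruder")
```

A table-driven mapping of this kind keeps the color scheme in one place, so an embodiment using different colors would only need to change the dictionary.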

In some aspects of the present embodiments, the backend server itself may provide the audible and/or visual notification(s) (e.g., the server may activate the light(s) inside the structure directly and without intervention of the A/V recording and communication device). Further, in some aspects of the present embodiments, the process may also provide a user, associated with a client device, with different audible and visual notifications. That is, the process may transmit an alert to a client device associated with the A/V recording and communication device. For example, the alert may be transmitted from the A/V recording and communication device to the user's client device via the user's network 110 and/or the network 112.

The alert may include streaming video images of the person(s) who was/were determined to have been suspicious. The user can then determine whether to take further action, such as alerting law enforcement and/or sharing the video footage with other people, such as via social media. The process of some embodiments may also provide further visual and/or audible alerts to draw the user's attention to the client device (e.g., emitting a loud noise from the speaker of the client device, flashing the display screen of the client device, etc.).

FIG. 6 is a flowchart illustrating a process for receiving identification data about a person at, or near, a property and determining a threat level associated with the person, according to the present embodiments. In some of the present embodiments, the process described with reference to FIG. 6 may be performed by one or more backend devices (e.g., backend APIs and/or servers). The process initiates by receiving (at block 610) identification data associated with a person. As described above, the person might be a visitor at, or near, a property. The process may receive the identification data for the person from an A/V recording and communication device, such as a video doorbell or a security camera. For example, in some aspects of the present embodiments, block 610 is the next operation after block 520 with reference to FIG. 5 described above. That is, the process may receive one or more video images of the person from an A/V recording and communication device after the A/V recording and communication device detects the person and captures images of the person.

After receiving the identification data, at block 620, the process performs a computer vision process on the received identification data to identify the person. As described above, the computer vision process may include face recognition, and/or any other biometrics recognition process. For example, in a face recognition process, the face of the person is compared with different databases of known persons, authorized persons, criminals, etc., as described above, in order to assign a threat level to the person. At block 630, the process determines whether the person was identified through the computer vision process (e.g., whether a match for the received identification data was found in any of the databases). When the process determines that no match was found, the process of some embodiments assigns (at block 660) an “unidentified” threat level to the person.

However, when the process is able to match the person against one or more of the databases, the process of some embodiments assigns (at block 640) a threat level to the identified person that may correspond to the database in which the person is found. For example, if the person is found in a convicted felons database, a high level of threat may be assigned to the person. At block 650, the process returns an alert back to the A/V recording and communication device, wherein the alert includes the assigned threat level associated with the detected person.
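The decision made at blocks 630 through 660 might be sketched as follows. The computer vision step of block 620 is not implemented here; its result is represented by a hypothetical person_id that either appears in a database or does not. The function name, the database layout, and the rule that the first matching database determines the level are assumptions made for this illustration.

```python
from typing import List, Optional, Set, Tuple

# Each entry pairs a threat level with the set of identities in that database.
Database = Tuple[str, Set[str]]


def assign_threat_level(person_id: Optional[str],
                        databases: List[Database]) -> str:
    """Return the threat level for a person.

    person_id is None when the computer vision process produced no candidate.
    Databases are checked in list order, so the ordering expresses which
    match takes precedence (an assumption of this sketch).
    """
    if person_id is not None:
        for threat_level, known_ids in databases:
            if person_id in known_ids:
                return threat_level      # block 640: level tied to the database
    return "unidentified"                # block 660: no match found


databases = [
    ("high", {"felon-17"}),              # e.g., a convicted felons database
    ("low", {"friend-02", "family-01"}), # e.g., authorized persons
]
felon_level = assign_threat_level("felon-17", databases)
friend_level = assign_threat_level("friend-02", databases)
stranger_level = assign_threat_level("stranger-99", databases)
```

The returned level corresponds to the content of the alert that is sent back to the A/V recording and communication device at block 650.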

Many of the present embodiments have been described with reference to persons detected by, or present in the area about, the A/V recording and communication device 130. The present embodiments are not limited, however, to scenarios involving humans. For example, the present embodiments contemplate that suspicious behavior may be committed by a bot or drone. In some instances, the mere presence of a bot or drone will be identified as suspicious; in other instances, loitering by a drone will be identified as suspicious.

FIG. 7 is a front view of an A/V recording and communication doorbell according to an aspect of the present disclosure. FIG. 7 illustrates that the front of the video doorbell 130 includes a front button 133, a faceplate 135, and a light pipe 136. The button 133 may make contact with a button actuator (not shown) located within the doorbell 130 when the button 133 is pressed by a visitor. When pressed, the button 133 may trigger one or more functions of the doorbell 130, as further described below. The front button 133 and the light pipe 136 may have various profiles that may or may not match the profile of the faceplate 135. The light pipe 136 may comprise any suitable material, including, without limitation, transparent plastic, that is capable of allowing light produced within the doorbell 130 to pass through. The light may be produced by one or more light-emitting components, such as light-emitting diodes (LEDs) 156 (FIG. 12), contained within the doorbell 130. In some aspects of the present embodiments, when the battery 166 of the doorbell 130 is recharged through a connection to AC mains power, the LEDs 156 may emit light to indicate that the battery 166 is being recharged.

With further reference to FIG. 7, the doorbell 130 further includes an enclosure 131 that engages the faceplate 135 in some aspects of the present embodiments. In the illustrated embodiment, the enclosure 131 abuts an upper edge 135T of the faceplate 135, but in alternative embodiments one or more gaps between the enclosure 131 and the faceplate 135 may facilitate the passage of sound and/or light through the doorbell 130. The enclosure 131 may comprise any suitable material, but in some embodiments the material of the enclosure 131 preferably permits infrared light to pass through from inside the doorbell 130 to the environment and vice versa. The doorbell 130 further includes a lens 132. In some embodiments, the lens may comprise a Fresnel lens, which may be patterned to deflect incoming light into one or more infrared sensors located within the doorbell 130. The doorbell 130 further includes a camera 134, which captures video data when activated, as described above and below.

FIG. 8 is an upper front perspective view of a security camera according to an aspect of the present embodiments. This figure illustrates that the security camera 330, similar to the video doorbell 130, includes a faceplate 135 that is mounted to a back plate 139 and an enclosure 131 that engages the faceplate 135. Collectively, the faceplate 135, the back plate 139, and the enclosure 131 form a housing that contains and protects the inner components of the security camera 330. However, unlike the video doorbell 130, the security camera 330 does not include any front button 133 for activating the doorbell. The faceplate 135 may comprise any suitable material, including, without limitation, metals, such as brushed aluminum or stainless steel, metal alloys, or plastics. The faceplate 135 protects the internal contents of the security camera 330 and serves as an exterior front surface of the security camera 330.

With continued reference to FIG. 8, the enclosure 131 engages the faceplate 135 and abuts an upper edge 135T of the faceplate 135. As discussed above with reference to FIG. 7, in alternative embodiments, one or more gaps between the enclosure 131 and the faceplate 135 may facilitate the passage of sound and/or light through the security camera 330. The enclosure 131 may comprise any suitable material, but in some embodiments the material of the enclosure 131 preferably permits infrared light to pass through from inside the security camera 330 to the environment and vice versa. The security camera 330 further includes a lens 132. Again, similar to the video doorbell 130, in some embodiments, the lens may comprise a Fresnel lens, which may be patterned to deflect incoming light into one or more infrared sensors located within the security camera 330. The security camera 330 further includes a camera 134, which captures video data when activated, as described above and below.

With reference to FIG. 8, the enclosure 131 may extend from the front of the security camera 330 around to the back thereof and may fit snugly around a lip (not shown) of the back plate 139. The back plate 139 may comprise any suitable material, including, without limitation, metals, such as brushed aluminum or stainless steel, metal alloys, or plastics. The back plate 139 protects the internal contents of the security camera 330 and serves as an exterior rear surface of the security camera 330. The faceplate 135 may extend from the front of the security camera 330 and at least partially wrap around the back plate 139, thereby allowing a coupled connection between the faceplate 135 and the back plate 139. The back plate 139 may have indentations (not shown) in its structure to facilitate the coupling.

With continued reference to FIG. 8, the security camera 330 further comprises a mounting apparatus 137. The mounting apparatus 137 facilitates mounting the security camera 330 to a surface, such as an interior or exterior wall of a building, such as a home or office. The faceplate 135 may extend from the bottom of the security camera 330 up to just below the camera 134, and connect to the back plate 139 as described above. The lens 132 may extend and curl partially around the side of the security camera 330. The enclosure 131 may extend and curl around the side and top of the security camera 330, and may be coupled to the back plate 139 as described above. The camera 134 may protrude from the enclosure 131, thereby giving it a wider field of view. The mounting apparatus 137 may couple with the back plate 139, thereby creating an assembly including the security camera 330 and the mounting apparatus 137. The couplings described in this paragraph, and elsewhere, may be secured by, for example and without limitation, screws, interference fittings, adhesives, or other fasteners. Interference fittings may refer to a type of connection where a material relies on pressure and/or gravity coupled with the material's physical strength to support a connection to a different element.

FIG. 9 is a flowchart illustrating a process for streaming and storing A/V content from the A/V recording and communication device 100 according to various aspects of the present disclosure. At block B200, the A/V recording and communication device 100 detects the visitor's presence and captures video images within a field of view of the camera 102. The A/V recording and communication device 100 may also capture audio through the microphone 104. As described above, the A/V recording and communication device 100 may detect the visitor's presence by detecting motion using the camera 102 and/or a motion sensor, and/or by detecting that the visitor has pressed a front button of the A/V recording and communication device 100 (if the A/V recording and communication device 100 is a doorbell). Also as described above, the video recording/capture may begin when the visitor is detected, or may begin earlier, as described below.

At block B202, a communication module of the A/V recording and communication device 100 sends a request, via the user's network 110 and the network 112, to a device in the network 112. For example, the network device to which the request is sent may be a server such as the server 118. The server 118 may comprise a computer program and/or a machine that waits for requests from other machines or software (clients) and responds to them. A server typically processes data. One purpose of a server is to share data and/or hardware and/or software resources among clients. This architecture is called the client-server model. The clients may run on the same computer or may connect to the server over a network. Examples of computing servers include database servers, file servers, mail servers, print servers, web servers, game servers, and application servers. The term server may be construed broadly to include any computerized process that shares a resource to one or more client processes. In another example, the network device to which the request is sent may be an API such as the backend API 120, which is described above.

In response to the request, at block B204 the network device may connect the A/V recording and communication device 100 to the user's client device 114 through the user's network 110 and the network 112. At block B206, the A/V recording and communication device 100 may record available audio and/or video data using the camera 102, the microphone 104, and/or any other device/sensor available. At block B208, the audio and/or video data is transmitted (streamed) from the A/V recording and communication device 100 to the user's client device 114 via the user's network 110 and the network 112. At block B210, the user may receive a notification on his or her client device 114 with a prompt to either accept or deny the call.

At block B212, the process determines whether the user has accepted or denied the call. If the user denies the notification, then the process advances to block B214, where the audio and/or video data is recorded and stored at a cloud server. The session then ends at block B216 and the connection between the A/V recording and communication device 100 and the user's client device 114 is terminated. If, however, the user accepts the notification, then at block B218 the user communicates with the visitor through the user's client device 114 while audio and/or video data captured by the camera 102, the microphone 104, and/or other devices/sensors is streamed to the user's client device 114. At the end of the call, the user may terminate the connection between the user's client device 114 and the A/V recording and communication device 100 and the session ends at block B216. In some embodiments, the audio and/or video data may be recorded and stored at a cloud server (block B214) even if the user accepts the notification and communicates with the visitor through the user's client device 114.
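The branching of FIG. 9 can be summarized, for illustration only, as the sequence of blocks visited in a session. The session_blocks function and the always_record parameter are hypothetical; the sketch models only the order of the blocks named in the description and performs no networking, recording, or streaming.

```python
from typing import List


def session_blocks(user_accepts: bool, always_record: bool = False) -> List[str]:
    """Return the FIG. 9 blocks visited for one session, in order.

    user_accepts models the decision evaluated at block B212.
    always_record models embodiments that store the data at a cloud
    server (block B214) even when the user accepts the notification.
    """
    blocks = ["B200", "B202", "B204", "B206", "B208", "B210", "B212"]
    if user_accepts:
        blocks.append("B218")          # user communicates with the visitor
        if always_record:
            blocks.append("B214")      # recording stored despite acceptance
    else:
        blocks.append("B214")          # data recorded and stored
    blocks.append("B216")              # session ends, connection terminated
    return blocks


denied = session_blocks(user_accepts=False)
accepted = session_blocks(user_accepts=True)
```

Both paths end at block B216; they differ only in whether block B218, block B214, or both are visited beforehand.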

FIG. 10 is a flowchart illustrating another process for an A/V recording and communication device according to an aspect of the present disclosure. At block B400, the user may select a “snooze time-out,” which is a time period during which the doorbell 130 may deactivate or otherwise not respond to stimuli (such as light, sound, or heat signatures) after an operation is performed, e.g., a notification is either accepted or denied/ignored. For example, the user may set a snooze time-out of 15 minutes.

At block B402, an object moves into the field of view of one or more of the PIR sensors 144. At block B404, the microcontroller 163 may trigger the communication module 164 to send a request to a network device. At block B406, the network device may connect the doorbell 130 to the user's client device 114 through the user's network 110 and the network 112. At block B408, audio/video data captured by the doorbell 130 may be streamed to the user's client device 114. At block B410, the user may receive a notification prompting the user to either accept or deny/ignore the request. If the request is denied or ignored, then at block B412b audio/video data may be recorded and stored at a cloud server 118.

After the doorbell 130 finishes recording, the object may remain in the PIR sensor 144 field of view at block B414. At block B416, the microcontroller 163 waits for the “snooze time” to elapse, e.g., 15 minutes, before triggering the communication module 164 to submit another request to the network device. After the snooze time, e.g., 15 minutes, elapses, the process moves back to block B404 and progresses as described above. The cycle may continue like this until the user accepts the notification request at block B410. The process then moves to block B412a, where live audio and/or video data is displayed on the user's client device 114, thereby allowing the user surveillance from the perspective of the doorbell 130.

At the user's request, the connection may be severed and the session ends at block B418. At this point the user may elect for the process to revert back to block B416, whereby there may be no further response until the snooze time, e.g., 15 minutes, has elapsed from the end of the previous session, or the user may elect for the process to return to block B402 and receive a notification the next time an object is perceived by one or more of the PIR sensors 144. In some embodiments, the audio and/or video data may be recorded and stored at a cloud server 118 (block B412b) even if the user accepts the notification and communicates with the visitor through the user's client device 114.
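The snooze behavior of FIG. 10 amounts to suppressing new requests until a configured interval has passed since the previous session ended. The following is a minimal sketch under that reading; the `SnoozeGate` class and its method names are hypothetical and are not taken from the disclosure, and time is passed in explicitly (in seconds) so that the logic can be followed without a real clock.

```python
from typing import Optional


class SnoozeGate:
    """Hypothetical helper that models the snooze time-out of FIG. 10."""

    def __init__(self, snooze_seconds: float) -> None:
        # Block B400: the user selects the snooze time-out (e.g., 15 minutes).
        self.snooze_seconds = snooze_seconds
        self._last_session_end: Optional[float] = None

    def session_ended(self, now: float) -> None:
        """Record the end of a session (recording finished or call ended)."""
        self._last_session_end = now

    def clear(self) -> None:
        """User elects to return to block B402: respond to the next motion."""
        self._last_session_end = None

    def should_notify(self, now: float) -> bool:
        """Block B416: send a new request only after the snooze time elapses."""
        if self._last_session_end is None:
            return True
        return now - self._last_session_end >= self.snooze_seconds
```

With `SnoozeGate(15 * 60)`, motion detected one minute after a session ends is ignored, while motion detected fifteen minutes after is forwarded as a new request.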

FIG. 11 is a functional block diagram of a client device 850 on which the present embodiments may be implemented according to various aspects of the present disclosure. The user's client device 114 described with reference to FIG. 1 may include some or all of the components and/or functionality of the client device 850. The client device 850 may comprise, for example, a smartphone.

With reference to FIG. 11, the client device 850 includes a processor 852, a memory 854, a user interface 856, a communication module 858, and a dataport 860. These components are communicatively coupled together by an interconnect bus 862. The processor 852 may include any processor used in smartphones and/or portable computing devices, such as an ARM processor (a processor based on the RISC (reduced instruction set computer) architecture developed by Advanced RISC Machines (ARM)). In some embodiments, the processor 852 may include one or more other processors, such as one or more conventional microprocessors, and/or one or more supplementary co-processors, such as math co-processors.

The memory 854 may include both operating memory, such as random access memory (RAM), as well as data storage, such as read-only memory (ROM), hard drives, flash memory, or any other suitable memory/storage element. The memory 854 may include removable memory elements, such as a CompactFlash card, a MultiMediaCard (MMC), and/or a Secure Digital (SD) card. In some embodiments, the memory 854 may comprise a combination of magnetic, optical, and/or semiconductor memory, and may include, for example, RAM, ROM, flash drive, and/or a hard disk or drive. The processor 852 and the memory 854 each may be, for example, located entirely within a single device, or may be connected to each other by a communication medium, such as a USB port, a serial port cable, a coaxial cable, an Ethernet-type cable, a telephone line, a radio frequency transceiver, or other similar wireless or wired medium or combination of the foregoing. For example, the processor 852 may be connected to the memory 854 via the dataport 860.

The user interface 856 may include any user interface or presentation elements suitable for a smartphone and/or a portable computing device, such as a keypad, a display screen, a touchscreen, a microphone, and a speaker. The communication module 858 is configured to handle communication links between the client device 850 and other, external devices or receivers, and to route incoming/outgoing data appropriately. For example, inbound data from the dataport 860 may be routed through the communication module 858 before being directed to the processor 852, and outbound data from the processor 852 may be routed through the communication module 858 before being directed to the dataport 860. The communication module 858 may include one or more transceiver modules capable of transmitting and receiving data, and using, for example, one or more protocols and/or technologies, such as GSM, UMTS (3GSM), IS-95 (CDMA one), IS-2000 (CDMA 2000), LTE, FDMA, TDMA, W-CDMA, CDMA, OFDMA, Wi-Fi, WiMAX, or any other protocol and/or technology.

The dataport 860 may be any type of connector used for physically interfacing with a smartphone and/or a portable computing device, such as a mini-USB port or an IPHONE®/IPOD® 30-pin connector or LIGHTNING® connector. In other embodiments, the dataport 860 may include multiple communication channels for simultaneous communication with, for example, other processors, servers, and/or client terminals.

The memory 854 may store instructions for communicating with other systems, such as a computer. The memory 854 may store, for example, a program (e.g., computer program code) adapted to direct the processor 852 in accordance with the present embodiments. The instructions also may include program elements, such as an operating system. While execution of sequences of instructions in the program causes the processor 852 to perform the process steps described herein, hard-wired circuitry may be used in place of, or in combination with, software/firmware instructions for implementation of the processes of the present embodiments. Thus, the present embodiments are not limited to any specific combination of hardware and software.

FIG. 12 is a functional block diagram of the components within or in communication with the doorbell 130, according to an aspect of the present embodiments. As described above, the bracket PCB 149 may comprise an accelerometer 150, a barometer 151, a humidity sensor 152, and a temperature sensor 153. The accelerometer 150 may be one or more sensors capable of sensing motion and/or acceleration. The barometer 151 may be one or more sensors capable of determining the atmospheric pressure of the surrounding environment in which the bracket PCB 149 may be located. The humidity sensor 152 may be one or more sensors capable of determining the amount of moisture present in the atmospheric environment in which the bracket PCB 149 may be located. The temperature sensor 153 may be one or more sensors capable of determining the temperature of the ambient environment in which the bracket PCB 149 may be located. As described above, the bracket PCB 149 may be located outside the housing of the doorbell 130 so as to reduce interference from heat, pressure, moisture, and/or other stimuli generated by the internal components of the doorbell 130.

With further reference to FIG. 12, the bracket PCB 149 may further comprise terminal screw inserts 154, which may be configured to receive the terminal screws 138 and transmit power to electrical contacts on the mounting bracket 137 (FIG. 8). The bracket PCB 149 may be electrically and/or mechanically coupled to the power PCB 148 through terminal screws, the terminal screw inserts 154, the spring contacts 140, and the electrical contacts. The terminal screws may receive electrical wires located at the surface to which the doorbell 130 is mounted, such as the wall of a building, so that the doorbell can receive electrical power from the building's electrical system. Upon the terminal screws being secured within the terminal screw inserts 154, power may be transferred to the bracket PCB 149, and to all of the components associated therewith, including the electrical contacts. The electrical contacts may transfer electrical power to the power PCB 148 by mating with the spring contacts 140.

With further reference to FIG. 12, the front PCB 146 may comprise a light sensor 155, one or more light-emitting components, such as LED's 156, one or more speakers 157, and a microphone 158. The light sensor 155 may be one or more sensors capable of detecting the level of ambient light of the surrounding environment in which the doorbell 130 may be located. LED's 156 may be one or more light-emitting diodes capable of producing visible light when supplied with power. The speakers 157 may be any electromechanical device capable of producing sound in response to an electrical signal input. The microphone 158 may be an acoustic-to-electric transducer or sensor capable of converting sound waves into an electrical signal. When activated, the LED's 156 may illuminate the light pipe 136 (FIG. 7). The front PCB 146 and all components thereof may be electrically coupled to the power PCB 148, thereby allowing data and/or power to be transferred to and from the power PCB 148 and the front PCB 146.

The speakers 157 and the microphone 158 may be coupled to the camera processor 170 through an audio CODEC 161. For example, digital audio transferred between the user's client device 114 and the speakers 157 and the microphone 158 may be compressed and decompressed using the audio CODEC 161, coupled to the camera processor 170. Once compressed by audio CODEC 161, digital audio data may be sent through the communication module 164 to the network 112, routed by one or more servers 118, and delivered to the user's client device 114. When the user speaks, after being transferred through the network 112, digital audio data is decompressed by audio CODEC 161 and emitted to the visitor via the speakers 157.

With further reference to FIG. 12, the power PCB 148 may comprise a power management module 162, a microcontroller 163, the communication module 164, and power PCB non-volatile memory 165. In certain embodiments, the power management module 162 may comprise an integrated circuit capable of arbitrating between multiple voltage rails, thereby selecting the source of power for the doorbell 130. The battery 166, the spring contacts 140, and/or the connector 160 may each provide power to the power management module 162. The power management module 162 may have separate power rails dedicated to the battery 166, the spring contacts 140, and the connector 160. In one aspect of the present disclosure, the power management module 162 may continuously draw power from the battery 166 to power the doorbell 130, while at the same time routing power from the spring contacts 140 and/or the connector 160 to the battery 166, thereby allowing the battery 166 to maintain a substantially constant level of charge. Alternatively, the power management module 162 may continuously draw power from the spring contacts 140 and/or the connector 160 to power the doorbell 130, while only drawing from the battery 166 when the power from the spring contacts 140 and/or the connector 160 is low or insufficient. The power management module 162 may also serve as a conduit for data between the connector 160 and the microcontroller 163.
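The alternative arrangement described above, in which external power is preferred and the battery 166 is used only as a fallback, can be sketched as a simple selection rule. This is an illustrative assumption rather than the disclosed circuit: the `PowerSource` enum, the `select_power_source` function, the preference for the spring contacts over the connector, and the 4.5 V threshold are all hypothetical, since the disclosure does not specify how "low or insufficient" power is determined.

```python
from enum import Enum


class PowerSource(Enum):
    SPRING_CONTACTS = "spring contacts 140"
    CONNECTOR = "connector 160"
    BATTERY = "battery 166"


def select_power_source(contacts_volts: float,
                        connector_volts: float,
                        min_volts: float = 4.5) -> PowerSource:
    """Prefer external rails; fall back to the battery when both are low.

    The min_volts threshold and the ordering of the two external rails are
    assumptions made for illustration only.
    """
    if contacts_volts >= min_volts:
        return PowerSource.SPRING_CONTACTS
    if connector_volts >= min_volts:
        return PowerSource.CONNECTOR
    # Both external rails are low or insufficient, so draw from the battery.
    return PowerSource.BATTERY
```

For example, `select_power_source(5.0, 0.0)` selects the spring contacts, and `select_power_source(1.0, 2.0)` falls back to the battery.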

With further reference to FIG. 12, in certain embodiments the microcontroller 163 may comprise an integrated circuit including a processor core, memory, and programmable input/output peripherals. The microcontroller 163 may receive input signals, such as data and/or power, from the PIR sensors 144, the bracket PCB 149, the power management module 162, the light sensor 155, the microphone 158, and/or the communication module 164, and may perform various functions as further described below. When the microcontroller 163 is triggered by the PIR sensors 144, the microcontroller 163 may perform one or more functions, such as those described above. When the light sensor 155 detects a low level of ambient light, the light sensor 155 may trigger the microcontroller 163 to enable “night vision.” The microcontroller 163 may also act as a conduit for data communicated between various components and the communication module 164.

With further reference to FIG. 12, the communication module 164 may comprise an integrated circuit including a processor core, memory, and programmable input/output peripherals. The communication module 164 may also be configured to transmit data wirelessly to a remote network device, and may include one or more transceivers (not shown). The wireless communication may comprise one or more wireless networks, such as, without limitation, Wi-Fi, cellular, Bluetooth, and/or satellite networks. The communication module 164 may receive inputs, such as power and/or data, from the camera PCB 147, the microcontroller 163, the button 133, the reset button 159, and/or the power PCB non-volatile memory 165. When the button 133 is pressed, the communication module 164 may be triggered to perform one or more functions, such as those described above with reference to FIG. 9. When the reset button 159 is pressed, the communication module 164 may be triggered to erase any data stored at the power PCB non-volatile memory 165 and/or at the camera PCB memory 169. The communication module 164 may also act as a conduit for data communicated between various components and the microcontroller 163. The power PCB non-volatile memory 165 may comprise flash memory configured to store and/or transmit data. For example, in certain embodiments the power PCB non-volatile memory 165 may comprise serial peripheral interface (SPI) flash memory.

With further reference to FIG. 12, the camera PCB 147 may comprise components that facilitate the operation of the camera 134. For example, an imager 171 may comprise a video recording sensor and/or a camera chip. In one aspect of the present disclosure, the imager 171 may comprise a complementary metal-oxide semiconductor (CMOS) array, and may be capable of recording high definition (720p or better) video files. A camera processor 170 may comprise an encoding and compression chip. In some embodiments, the camera processor 170 may comprise a bridge processor. The camera processor 170 may process video recorded by the imager 171 and audio recorded by the microphone 158, and may transform this data into a form suitable for wireless transfer by the communication module 164 to a network. The camera PCB memory 169 may comprise volatile memory that may be used when data is being buffered or encoded by the camera processor 170. For example, in certain embodiments the camera PCB memory 169 may comprise synchronous dynamic random access memory (SDRAM). IR LED's 168 may comprise light-emitting diodes capable of radiating infrared light. IR cut filter 167 may comprise a system that, when triggered, configures the imager 171 to see primarily infrared light as opposed to visible light. When the light sensor 155 detects a low level of ambient light (which may comprise a level that impedes the performance of the imager 171 in the visible spectrum), the IR LED's 168 may shine infrared light through the doorbell 130 enclosure out to the environment, and the IR cut filter 167 may enable the imager 171 to see this infrared light as it is reflected or refracted off of objects within the field of view of the doorbell. This process may provide the doorbell 130 with the “night vision” function mentioned above.
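The “night vision” trigger described above reduces to a threshold test on the ambient light reading. The sketch below assumes a lux-valued reading and a fixed threshold; neither the unit nor the value of 10 lux appears in the disclosure, and the `NightVisionState` class and `update_night_vision` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class NightVisionState:
    ir_leds_on: bool               # whether the IR LED's 168 are emitting
    ir_cut_filter_triggered: bool  # whether the IR cut filter 167 is triggered


def update_night_vision(ambient_lux: float,
                        threshold_lux: float = 10.0) -> NightVisionState:
    """Enable both IR components when ambient light falls below a threshold.

    threshold_lux is an assumed value standing in for the light level that
    impedes the imager 171 in the visible spectrum.
    """
    low_light = ambient_lux < threshold_lux
    return NightVisionState(ir_leds_on=low_light,
                            ir_cut_filter_triggered=low_light)
```

A reading of 2 lux would switch both components on under these assumptions, while a reading of 300 lux would leave both off.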

FIG. 13 is a functional block diagram of a general-purpose computing system on which the present embodiments may be implemented according to various aspects of the present disclosure. The computer system 1000 may be embodied in at least one of a personal computer (also referred to as a desktop computer) 1000A, a portable computer (also referred to as a laptop or notebook computer) 1000B, and/or a server 1000C. A server is a computer program and/or a machine that waits for requests from other machines or software (clients) and responds to them. A server typically processes data. The purpose of a server is to share data and/or hardware and/or software resources among clients. This architecture is called the client-server model. The clients may run on the same computer or may connect to the server over a network. Examples of computing servers include database servers, file servers, mail servers, print servers, web servers, game servers, and application servers. The term server may be construed broadly to include any computerized process that shares a resource with one or more client processes.

The computer system 1000 may execute at least some of the operations described above. The computer system 1000 may include at least one processor 1010, memory 1020, at least one storage device 1030, and input/output (I/O) devices 1040. Some or all of the components 1010, 1020, 1030, 1040 may be interconnected via a system bus 1050. The processor 1010 may be single- or multi-threaded and may have one or more cores. The processor 1010 may execute instructions, such as those stored in the memory 1020 and/or in the storage device 1030. Information may be received and output using one or more I/O devices 1040.

The memory 1020 may store information, and may be a computer-readable medium, such as volatile or non-volatile memory. The storage device(s) 1030 may provide storage for the system 1000, and may be a computer-readable medium. In various aspects, the storage device(s) 1030 may be a flash memory device, a hard disk device, an optical disk device, a tape device, or any other type of storage device.

The I/O devices 1040 may provide input/output operations for the system 1000. The I/O devices 1040 may include a keyboard, a pointing device, and/or a microphone. The I/O devices 1040 may further include a display unit for displaying graphical user interfaces, a speaker, and/or a printer. External data may be stored in one or more accessible external databases 1060.

The features of the present embodiments described herein may be implemented in digital electronic circuitry, and/or in computer hardware, firmware, software, and/or in combinations thereof. Features of the present embodiments may be implemented in a computer program product tangibly embodied in an information carrier, such as a machine-readable storage device, and/or in a propagated signal, for execution by a programmable processor. Embodiments of the present method steps may be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output.

The features of the present embodiments described herein may be implemented in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and/or instructions from, and to transmit data and/or instructions to, a data storage system, at least one input device, and at least one output device. A computer program may include a set of instructions that may be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program may be written in any form of programming language, including compiled or interpreted languages, and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.

Suitable processors for the execution of a program of instructions may include, for example, both general and special purpose processors, and/or the sole processor or one of multiple processors of any kind of computer. Generally, a processor may receive instructions and/or data from a read only memory (ROM), or a random access memory (RAM), or both. Such a computer may include a processor for executing instructions and one or more memories for storing instructions and/or data.

Generally, a computer may also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files. Such devices include magnetic disks, such as internal hard disks and/or removable disks, magneto-optical disks, and/or optical disks. Storage devices suitable for tangibly embodying computer program instructions and/or data may include all forms of non-volatile memory, including for example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices, magnetic disks such as internal hard disks and removable disks, magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in, one or more ASICs (application-specific integrated circuits).

To provide for interaction with a user, the features of the present embodiments may be implemented on a computer having a display device, such as an LCD (liquid crystal display) monitor, for displaying information to the user. The computer may further include a keyboard, a pointing device, such as a mouse or a trackball, and/or a touchscreen by which the user may provide input to the computer.

The features of the present embodiments may be implemented in a computer system that includes a back-end component, such as a data server, and/or that includes a middleware component, such as an application server or an Internet server, and/or that includes a front-end component, such as a client computer having a graphical user interface (GUI) and/or an Internet browser, or any combination of these. The components of the system may be connected by any form or medium of digital data communication, such as a communication network. Examples of communication networks may include, for example, a LAN (local area network), a WAN (wide area network), and/or the computers and networks forming the Internet.

The computer system may include clients and servers. A client and server may be remote from each other and interact through a network, such as those described herein. The relationship of client and server may arise by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

The above description presents the best mode contemplated for carrying out the present embodiments, and of the manner and process of practicing them, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which they pertain to practice these embodiments. The present embodiments are, however, susceptible to modifications and alternate constructions from those discussed above that are fully equivalent. Consequently, the present invention is not limited to the particular embodiments disclosed. On the contrary, the present invention covers all modifications and alternate constructions coming within the spirit and scope of the present disclosure. For example, the steps in the processes described herein need not be performed in the same order as they have been presented, and may be performed in any order(s). Further, steps that have been presented as being performed separately may in alternative embodiments be performed concurrently. Likewise, steps that have been presented as being performed concurrently may in alternative embodiments be performed separately.

What is claimed is:
 1. For an audio/video (A/V) recording and communication device comprising a camera having a field of view, a method for notifying a user of a threat level associated with a person within the field of view of the camera, the method comprising: receiving, from the camera, identification data for the person; transmitting the received identification data to at least one backend server; receiving, from the backend server, information about a threat level associated with the person; and notifying the user of the threat level.
 2. The method of claim 1, wherein notifying the user of the threat level comprises providing one of a plurality of different threat level notification types based on different threat levels.
 3. The method of claim 1, wherein notifying the user of the threat level comprises providing a visual notification to the user.
 4. The method of claim 3, wherein the A/V recording and communication device is associated with a structure having at least one colored light, wherein providing the visual notification comprises selecting a particular color from a set of colors for the colored light.
 5. The method of claim 4, wherein each color in the set of colors corresponds to a different threat level.
 6. The method of claim 1, wherein notifying the user of the threat level comprises providing an audible notification.
 7. The method of claim 1, wherein notifying the user of the threat level comprises providing a notification on a user's device.
 8. The method of claim 7, wherein the user's device is a smartphone.
 9. The method of claim 1, wherein the A/V recording and communication device is a doorbell.
 10. The method of claim 1, wherein the A/V recording and communication device is a security camera.
 11. The method of claim 1, wherein receiving the identification data for the person comprises: detecting a movement of the person; and capturing video images, by the camera, of the person.
 12. The method of claim 11, wherein the identification data for the person comprises one or more video images of the person.
 13. The method of claim 1 further comprising notifying other residents of a same neighborhood, in which the audio/video recording and communication device is located, of the threat level.
 14. The method of claim 13, wherein notifying the other residents comprises notifying the other residents through streetlights located in the neighborhood.
 15. An audio/video (A/V) recording and communication device, comprising: a camera configured to record video image data of an area about the A/V recording and communication device; one or more processors; a communication module configured to transmit streaming video to a client device; and a non-transitory machine readable medium storing a program which when executed by at least one of the processors notifies a user of a threat level associated with a person within a field of view of the camera, the program comprising sets of instructions for: receiving, from the camera, identification data for the person; transmitting the received identification data to at least one backend server; receiving, from the backend server, information about a threat level associated with the person; and notifying the user of the threat level.
 16. The A/V recording and communication device of claim 15, wherein the set of instructions for notifying the user of the threat level comprises a set of instructions for providing one of a plurality of different threat level notification types based on different threat levels.
 17. The A/V recording and communication device of claim 15, wherein the set of instructions for notifying the user of the threat level comprises a set of instructions for providing a visual notification.
 18. The A/V recording and communication device of claim 17, wherein the A/V recording and communication device is associated with a structure having at least one colored light, wherein the set of instructions for providing the visual notification comprises a set of instructions for selecting a particular color from a set of different colors for the colored light.
 19. The A/V recording and communication device of claim 18, wherein each color in the set of colors corresponds to a different threat level.
 20. The A/V recording and communication device of claim 15, wherein the set of instructions for notifying the user of the threat level comprises a set of instructions for providing an audible notification to the user.
 21. The A/V recording and communication device of claim 15, wherein the set of instructions for notifying the user of the threat level comprises a set of instructions for providing a notification on a client device.
 22. The A/V recording and communication device of claim 21, wherein the client device is a smartphone.
 23. The A/V recording and communication device of claim 15, wherein the A/V recording and communication device is a doorbell.
 24. The A/V recording and communication device of claim 15, wherein the A/V recording and communication device is a security camera.
 25. The A/V recording and communication device of claim 15, wherein the set of instructions for receiving the identification data for the person comprises sets of instructions for: detecting a movement of the person within the field of view of the camera; and capturing video images, by the camera, of the person and a set of other objects that are within the field of view of the camera.
 26. The A/V recording and communication device of claim 25, wherein the identification data for the person comprises one or more video images of the person.
 27. The A/V recording and communication device of claim 15 further comprising a set of instructions for notifying other persons living in a same neighborhood, in which the A/V recording and communication device is located, of the threat level.
 28. The A/V recording and communication device of claim 27, wherein the set of instructions for notifying the other persons comprises a set of instructions for notifying the other persons through streetlights located in the neighborhood.